Gene construct encoding a heterologous prodrug-activating enzyme and a cell targeting moiety

ABSTRACT

The invention provides a gene construct encoding a cell targeting moiety and a heterologous prodrug activating enzyme for use as a medicament in a mammalian host wherein the gene construct is capable of expressing the cell targeting moiety and enzyme as a conjugate within a target cell in the mammalian host and wherein the conjugate is directed to leave the cell thereafter for selective localisation at a cell surface antigen recognised by the cell targeting moiety.

This application is a national stage application filed under 35 U.S.C. 371 of International Patent Appln. No. PCT/GB98/01294, filed May 5, 1998.

This invention relates particularly to gene directed enzyme prodrug therapy (GDEPT) using in situ antibody generation to provide enhanced selectivity, particularly for use in cancer therapy.

Known gene therapy based prodrug therapeutic approaches include virus-directed enzyme prodrug therapy (VDEPT) and gene-directed enzyme prodrug therapy (GDEPT), the latter term encompassing both VDEPT and non-viral delivery systems. VDEPT involves targeting tumour cells with a viral vector carrying a gene which codes for an enzyme capable of activating a prodrug. The viral vector enters the tumour cell and enzyme is expressed from the enzyme gene inside the cell. In GDEPT, alternative approaches such as microinjection, liposomal delivery and receptor mediated DNA uptake as well as viruses may be used to deliver the gene encoding the enzyme.

In both VDEPT and GDEPT the enzyme gene can be transcriptionally regulated by DNA sequences capable of being selectively activated in mammalian cells e.g. tumour cells (EP 415 731 (Wellcome); Huber et al, Proc. Natl. Acad. Sci. USA. 88, 8039-8043, 1991). While giving some degree of selectivity, gene expression may also occur in non-target cells and this is clearly undesirable when the approach is being used to activate prodrugs into potent cytotoxic agents. In addition these regulatory sequences will generally lead to reduced expression of the enzyme compared with using viral promoters and this will lead to a reduced ability to convert prodrug in the target tissue.

Expression and localisation of the prodrug activating enzyme inside the cell has disadvantages. Prodrug design is severely limited by the fact that the prodrug has to be able to cross the cell membrane and enter the cell but not be toxic until it is converted to the drug inside the cell by the activating enzyme. Most prodrugs utilise hydrophilic groups to prevent cell entry and thus reduce cytotoxicity. Prodrug turnover by activating enzyme produces a less hydrophilic drug which can enter cells to produce anti-cancer effects. This approach cannot be used when the activating enzyme is expressed inside the cell. Another disadvantage is that target cells which lack intracellular activating enzyme will be difficult to attack because they are unable to generate active drug. To achieve this desirable “bystander activity” (or “neighbouring cell kill”), the active drug will have to be capable of diffusion out of the cell containing activating enzyme to reach target cells which lack enzyme expression. Many active drugs when produced inside a cell will be unable to escape from the cell to achieve this bystander effect.

Modifications of GDEPT have been put forward to overcome some of the problems described above. Firstly vectors have been described which are said to express the activating enzyme on the surface of the target cell (WO 96/03515) by attaching a signal peptide and transmembrane domain to the activating enzyme. The approach, if viable, would overcome the problems of having the activating enzyme located inside the cell but would still have to rely on transcriptionally regulated sequences capable of being selectively expressed in target cells to restrict cell expression. As described above there are disadvantages of using such sequences. Secondly vectors have been described which result in secretion of the enzyme from the target cell (WO 96/16179). In this approach the enzyme would be able to diffuse away from its site of generation since it is extracellular and not attached to the cell surface. Enzyme which has diffused away from the target site would be capable of activating prodrug at non-target sites leading to unwanted toxicity. To achieve some selectivity it is suggested that enzyme precursors could be used which are cleaved by pathology associated proteases to form active enzyme. Some selectivity is likely to be achieved by this approach but it is unlikely that activation will only occur at target sites. In addition, once activated, the enzyme will still be free to diffuse away from the target site and thus suffer from the same drawback described above.

For GDEPT approaches, three levels of selectivity can be observed. Firstly, there is selectivity at the cell infection stage such that only specific cell types are targeted. For example cell selectivity can be provided by the gene delivery system per se. An example of this type of selectivity is set out in International Patent Application WO 95/26412 (UAB Research Foundation) which describes the use of modified adenovirus fiber proteins incorporating cell specific ligands. Other examples of cell specific targeting include ex vivo gene transfer to specific cell populations such as lymphocytes and direct injection of DNA into muscle tissue.

The second level of selectivity is control of gene expression after cell infection such as for example by the use of cell or tissue specific promoters. If the gene has been delivered to a cell type in a selective manner then it is important that a promoter is chosen that is compatible with activity in the cell type.

The third level of selectivity can be considered as the selectivity of the expressed gene construct. Selectivity at this level has received scant attention to date. In International patent application WO 96/16179 (Wellcome Foundation) it is suggested that enzyme precursors could be used which are cleaved by pathology associated proteases to form active enzyme. Some selectivity is likely to be achieved by this approach but it is unlikely that activation will only occur at target sites. In addition, once activated, the enzyme will still be free to diffuse away from the target site and thus suffer from the same drawback of activating prodrug at non-target sites leading to unwanted toxicity.

There exists a need for more selective GDEPT systems to reduce undesirable effects in normal tissues arising from erroneous prodrug activation.

The present invention is based on the discovery that antibody-heterologous enzyme gene constructs can be expressed intracellularly and used in GDEPT systems (or other systems such as AMIRACS, see below) for cell targeting arising from antibody specificity to deliver cell surface available enzyme in a selective manner. This approach may be used optionally in combination with any other suitable specificity enhancing technique(s) such as targeted cell infection and/or tissue specific expression.

According to one aspect of the present invention there is provided a gene construct encoding a cell targeting antibody and a heterologous enzyme for use as a medicament in a mammalian host wherein the gene construct is capable of expressing the antibody and enzyme as a conjugate within a target cell in the mammalian host and wherein the conjugate can leave the cell thereafter for selective localisation at a cell surface antigen recognised by the antibody.

According to another aspect of the present invention there is provided a gene construct encoding a cell targeting moiety and a heterologous prodrug activating enzyme for use as a medicament in a mammalian host wherein the gene construct is capable of expressing the cell targeting moiety and heterologous prodrug activating enzyme as a conjugate within a cell in the mammalian host and wherein the conjugate is directed to leave the cell thereafter for selective localisation at a cell surface antigen recognised by the cell targeting moiety.

The “cell targeting moiety” is defined as any polypeptide or fragment thereof which selectively binds to a particular cell type in a host through recognition of a cell surface antigen. Preferably the cell targeting moiety is an antibody. Cell targeting moieties other than antibodies include ligands as described for use in Ligand Directed Enzyme Prodrug Therapy as described in International patent application WO 97/26918, Cancer Research Campaign Technology Limited, such as for example epidermal growth factor, heregulin, c-erbB2 and vascular endothelial growth factor with the latter being preferred.

A “cell targeting antibody” is defined as an antibody or fragment thereof which selectively binds to a particular cell type in a host through recognition of a cell surface antigen. Preferred cell targeting antibodies are specific for solid tumours, more preferably colorectal tumours, more preferably an anti-CEA antibody, more preferably antibody A5B7 or 806.077 antibody with 806.077 antibody being especially preferred. Hybridoma 806.077 antibody was deposited at the European Collection of Animal Cell Cultures (ECACC), PHLS Centre for Applied Microbiology & Research, Porton Down, Salisbury, Wiltshire SP4 0JG, United Kingdom on Feb. 29, 1996 under accession no. 96022936 in accordance with the Budapest Treaty.

Antibody A5B7 binds to human carcinoembryonic antigen (CEA) and is particularly suitable for targeting colorectal carcinoma. A5B7 is available from DAKO Ltd., 16 Manor Courtyard, Hughenden Avenue, High Wycombe, Bucks HP13 5RE, England, United Kingdom. In general the antibody (or antibody fragment)-enzyme conjugate should be at least divalent, that is to say capable of binding to at least 2 tumour associated antigens (which may be the same or different). Antibody molecules may be humanised by known methods such as for example by “CDR grafting” as disclosed in EP239400 or by grafting complete variable regions from for example a murine antibody onto human constant regions (“chimaeric antibodies”) as disclosed in U.S. Pat. No. 4,816,567. Humanised antibodies may be useful for reducing immunogenicity of an antibody (or antibody fragment). A humanised version of antibody A5B7 has been disclosed in International Patent Application WO 92/01059 (Celltech).

The hybridoma which produces monoclonal antibody A5B7 was deposited with the European Collection of Animal Cell Cultures, Division of Biologics, PHLS Centre for Applied Microbiology and Research, Porton Down, Salisbury, Wiltshire SP4 0JG, United Kingdom. The date of deposit was Jul. 14, 1993 and the accession number is No. 93071411. Antibody A5B7 may be obtained from the deposited hybridoma using standard techniques known in the art such as documented in Fenge C, Fraune E & Schuegerl K in “Production of Biologicals from Animal Cells in Culture” (Spier R E, Griffiths J R & Meignier B, eds) Butterworth-Heinemann, 1991, 262-265 and Anderson B L & Gruenberg M L in “Commercial Production of Monoclonal Antibodies” (Seaver S, ed), Marcel Dekker, 1987, 175-195. The cells may require re-cloning from time to time by limiting dilution in order to maintain good levels of antibody production.

A “heterologous enzyme” is defined as an enzyme for turning over a substrate that has been administered to the host and the enzyme is not naturally present in the relevant compartment of the host. The enzyme may be foreign to the mammalian host (e.g. a bacterial enzyme like CPG2) or it may not naturally occur within the relevant host compartment (e.g. the use of lysozyme as an ADEPT enzyme (for an explanation of ADEPT see below) is possible because lysozyme does not occur naturally in the circulation, see U.S. Pat. No. 5,433,955, Akzo Nev.). The relevant host compartment is that part of the mammalian host in which the substrate is distributed. Preferred enzymes are enzymes suitable for ADEPT or AMIRACS (Antimetabolite with Inactivation of Rescue Agents at Cancer Sites; see Bagshawe (1994) in Cell Biophysics 24/25, 83-91) but ADEPT enzymes are preferred. Antibody directed enzyme prodrug therapy (ADEPT) is a known cancer therapeutic approach. ADEPT uses a tumour selective antibody conjugated to an enzyme. The conjugate is administered to the patient (usually intravenously), allowed to localise at the tumour site(s) and clear from the blood and other normal tissues. A prodrug is then administered to the patient which is converted by the enzyme (localised at the tumour site) into a cytotoxic drug which kills the tumour cells.

In International Patent Application WO 96/20011, published Jul. 4, 1996, we proposed a “reversed polarity” ADEPT system based on mutant human enzymes having the advantage of low immunogenicity compared with for example bacterial enzymes. A particular host enzyme was human pancreatic CPB (see for example, Example 15 [D253K]human CPB & 16 [D253R]human CPB therein) and prodrugs therefor (see Examples 18 & 19 therein). The host enzyme is mutated to give a change in mode of interaction between enzyme and prodrug in terms of recognition of substrate compared with the native host enzyme. In our subsequent International Patent Application No PCT/GB96/01975 (published Mar. 6, 1997 as WO 97/07796) further work on mutant CPB enzyme/prodrug combinations for ADEPT is described. Preferred enzymes suitable for ADEPT are any one of CPG2 or a reversed polarity CPB enzyme, for example any one of [D253K]HCPB, [G251T,D253K]HCPB or [A248S,G251T,D253K]HCPB. A preferred form of CPG2 is one in which the polypeptide glycosylation sites have been mutated so as to prevent or reduce glycosylation on expression in mammalian cells (see WO 96/03515, Cancer Research Campaign Technology); this gives improved enzyme activity. Further considerations arise for enzymes such as CPB which require a pro domain to facilitate correct folding; here the pro domain can either be expressed separately (in trans) or expressed as part of the fusion protein and subsequently removed.
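The mutant names used above follow the usual substitution shorthand, in which [D253K]HCPB denotes HCPB with the aspartate (D) at position 253 replaced by lysine (K). As an illustration only, the notation can be read mechanically as in the following Python sketch. The function names, the `offset` parameter and the six-residue fragment are assumptions made for this example; the fragment is a placeholder and is not the real HCPB sequence.

```python
import re


def parse_mutant_name(name):
    """Split a name such as '[G251T,D253K]HCPB' into (['G251T', 'D253K'], 'HCPB')."""
    match = re.fullmatch(r"\[([^\]]+)\](\w+)", name)
    if match is None:
        raise ValueError(f"unrecognised mutant name: {name!r}")
    return match.group(1).split(","), match.group(2)


def apply_substitutions(sequence, mutations, offset=1):
    """Apply point substitutions such as 'D253K' to a one-letter protein sequence.

    `offset` is the residue number of the first character of `sequence`
    (hypothetical parameter, so that a short fragment can be used).
    """
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised substitution: {mutation!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        index = position - offset
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} lies outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at {position}, found {residues[index]}")
        residues[index] = replacement
    return "".join(residues)


# Placeholder fragment standing in for residues 248-253; NOT the real HCPB sequence.
toy_fragment = "AQLGVD"
mutations, enzyme = parse_mutant_name("[A248S,G251T,D253K]HCPB")
mutant_fragment = apply_substitutions(toy_fragment, mutations, offset=248)
```

Checking the wild-type residue before substituting guards against applying a mutation list to a sequence that is numbered differently, for example one that still carries the pro domain.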

Large scale purification of CPG2 from Pseudomonas RS-16 was described in Sherwood et al (1985), Eur. J. Biochem., 148, 447-453. CPG2 may be obtained from Centre for Applied Microbiology and Research, Porton Down, Salisbury, Wiltshire SP4 0JG, United Kingdom. CPG2 may also be obtained by recombinant techniques. The nucleotide coding sequence for CPG2 has been published by Minton, N. P. et al., Gene, 31 (1984), 31-38. Expression of the coding sequence has been reported in E. coli (Chambers, S. P. et al., Appl. Microbiol. Biotechnol. (1988), 29, 572-578) and in Saccharomyces cerevisiae (Clarke, L. E. et al., J. Gen. Microbiol. (1985) 131, 897-904). Total gene synthesis has been described by M. Edwards in Am. Biotech. Lab (1987), 5, 38-44. Expression of heterologous proteins in E. coli has been reviewed by F. A. O. Marston in DNA Cloning Vol. III, Practical Approach Series, IRL Press (Editor D M Glover), 1987, 59-88. Expression of proteins in yeast has been reviewed in Methods in Enzymology Volume 194, Academic Press 1991, Edited by C. Guthrie and G R Fink.

Whilst cancer therapeutic approaches are preferred the invention may also be applied to other therapeutic areas as long as a target antigen can be selected and a suitable enzyme/prodrug combination prepared. For example, inflammatory diseases such as rheumatoid arthritis may be treated by for example using an antibody selective for synovial cells fused to an enzyme capable of converting an anti-inflammatory drug in the form of a prodrug into an anti-inflammatory drug. Use of antibodies to target rheumatoid arthritis disease has been described in Blakey et al, 1988, Scand. J. Rheumatology, Suppl. 76, 279-287.

A “conjugate” between antibody and enzyme can be a fusion protein (covalent linkage) or the conjugate can be formed by non-covalent binding between antibody and enzyme formed in situ. Preferably the conjugate is in the form of a fusion protein, more preferably the antibody component of the fusion is at least divalent (for improved binding avidity compared with monovalent antibody). Antibody constructs lacking an Fc portion are preferred, especially Fab or F(ab′)₂ fragments. For CPG2 fusions (or fusions with any non-monomeric enzyme) special considerations apply because CPG2 is a dimeric enzyme and the antibody is preferably divalent thus there exists the potential for undesirable competing dimerisation between two molecular species. Therefore a preferred CPG2 fusion is one in which the fusion protein is formed through linking a C-terminus of an antibody Fab heavy chain (i.e. lacking a hinge region) to an N-terminus of a CPG2 molecule; two of these Fab-CPG2 molecules then dimerise through the CPG2 dimerisation domain to form a (Fab-CPG2)₂ conjugate. For antibody constructs with monomeric enzymes, F(ab′)₂ fragments are preferred, especially F(ab′)₂ fragments having a human IgG3 hinge region. Fusions between antibody and enzyme may optionally be effected through a short peptide linker such as for example (G₄S)₃. Preferred fusion constructs are those in which the enzyme is fused to the C terminus of the antibody, through the heavy or light chain thereof with fusion through the antibody heavy chain being preferred. Accordingly a preferred gene construct is a gene construct for use as a medicament as described herein in which the antibody-enzyme CPG2 conjugate is a fusion protein in which the enzyme is fused to the C terminus of the antibody through the heavy or light chain thereof whereby dimerisation of the encoded conjugate when expressed can take place through a dimerisation domain on CPG2. A more preferred gene construct is a gene construct for use as a medicament wherein the fusion protein is formed through linking a C-terminus of an antibody Fab heavy chain to an N-terminus of a CPG2 molecule to form a Fab-CPG2 whereby two Fab-CPG2 molecules when expressed dimerise through CPG2 to form a (Fab-CPG2)₂ conjugate. In another embodiment of the invention a preferred gene construct for use as a medicament is one wherein the carboxypeptidase is selected from [D253K]HCPB, [G251T,D253K]HCPB or [A248S,G251T,D253K]HCPB.
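By way of illustration only, the linker shorthand (G₄S)₃ and the N-to-C order of the preferred Fab-CPG2 fusion can be sketched in Python as follows. The function names are assumptions made for this example, and the heavy chain and enzyme strings are placeholders rather than real Fab or CPG2 sequences.

```python
import re

# Map Unicode subscript digits onto ASCII digits so that "(G₄S)₃" and "(G4S)3" both parse.
SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")


def expand_linker(notation):
    """Expand linker shorthand such as '(G4S)3' into one-letter amino acid code."""
    text = notation.translate(SUBSCRIPTS)
    match = re.fullmatch(r"\(((?:[A-Z]\d*)+)\)(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised linker notation: {notation!r}")
    # Each residue letter may carry its own repeat count, e.g. G4 -> GGGG.
    unit = "".join(
        residue * int(count or 1)
        for residue, count in re.findall(r"([A-Z])(\d*)", match.group(1))
    )
    return unit * int(match.group(2))


def fab_enzyme_fusion(fab_heavy_chain, enzyme, linker=""):
    """Join the parts N-terminus to C-terminus: Fab heavy chain, optional linker, enzyme."""
    return fab_heavy_chain + linker + enzyme


linker = expand_linker("(G₄S)₃")
# "HHH" and "EEE" are placeholder strings, not real Fab heavy chain or CPG2 sequences.
fusion = fab_enzyme_fusion("HHH", "EEE", linker)
```

The sketch covers only the order of the polypeptide parts in a single chain. It does not model the light chain, the secretory leader, or the dimerisation through CPG2 that gives the (Fab-CPG2)₂ conjugate.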

It is contemplated that should it be possible to obtain a natural multimeric enzyme in monomeric form whilst substantially retaining enzymic activity then the monomeric form of the enzyme could be used to form a conjugate of the invention. Similarly, it is contemplated that should it be possible to obtain a natural monomeric enzyme in multimeric form whilst substantially retaining enzymic activity then the multimeric form of the enzyme could be used to form a conjugate of the invention.

The conjugate is directed to leave the cell after expression therein through use of a secretory leader sequence which is cleaved as the conjugate passes through the cell membrane. Preferably the secretory leader is the secretory leader that occurs naturally with the antibody.

According to another aspect of the present invention there is provided use of a gene construct encoding a cell targeting antibody and a heterologous enzyme for manufacture of a medicament for cancer therapy in a mammalian host wherein the gene construct is capable of expressing the antibody and enzyme as a conjugate within a target cell in the mammalian host and wherein the conjugate can leave the cell thereafter for selective localisation at a cell surface antigen recognised by the antibody.

Any suitable delivery system may be applied to deliver the gene construct of the present invention including viral and non-viral systems. Viral systems include retroviral vectors, adenoviral vectors, adeno-associated virus, vaccinia, herpes simplex virus, HIV, the minute virus of mice, hepatitis B virus and influenza virus. Non-viral systems include uncomplexed DNA, DNA-liposome complexes, DNA-protein complexes and DNA-coated gold particles.

Retroviral vectors lack immunogenic proteins and there is no preexisting host immunity, but they are limited to infecting dividing cells. Retroviruses have been used in clinical trials (Rosenberg et al., N. Engl. J. Med., 1990, 323: 570-578). Retroviruses are composed of an RNA genome that is packaged in an envelope derived from host cell membrane and viral proteins. For gene expression, the retrovirus must first reverse transcribe its positive-strand RNA genome into double-stranded DNA, which is then integrated into the host cell DNA using reverse transcriptase and integrase protein contained in the retrovirus particle. The integrated provirus is able to use host cell machinery for gene expression.

Murine leukemia virus is widely used (Miller et al., Methods Enzymol., 1993, 217: 581-599). Retroviral vectors are constructed by removal of the gag, pol and env genes to make room for the relevant payload and to eliminate the replicative functions of the virus. Virally encoded mRNAs are eliminated and this removes any potential immune response to the transduced cells. Genes encoding antibiotic resistance often are included as a means of selection. Promoter and enhancer functions also may be included for example to provide for tissue-specific expression after administration in vivo. Promoter and enhancer functions contained in the long terminal repeat may also be used.

These viruses can be produced only in viral packaging cell lines. The packaging cell line may be constructed by stably inserting the deleted viral genes (gag, pol and env) into the cell such that they reside on different chromosomes to prevent recombination. The packaging cell line is used to construct a producer cell line that will generate replication-defective retrovirus containing the relevant payload gene by inserting the recombinant proviral DNA. Plasmid DNA containing the long terminal repeat sequences flanking a small portion of the gag gene that contains the encapsidation sequence and the genes of interest is transfected into the packaging cell line using standard techniques for DNA transfer and uptake (electroporation, calcium precipitation, etc.). Variants of this approach have been employed to decrease the likelihood of production of replication-competent virus (Jolly, D., Cancer Gene Therapy, 1994, 1, 51-64). The host cell range of the virus is determined by the envelope gene (env) and substitution of env genes with different cell specificities can be employed. Incorporation of appropriate ligands into the envelope protein may also be used for targeting.

Administration may be achieved by any suitable technique e.g. ex vivo transduction of patients' cells, by the direct injection of virus into tissue, and by the administration of the retroviral producer cells.

The ex vivo approach has a disadvantage in that it requires the isolation and maintenance in tissue culture of the patient's cells, but it has the advantage that the extent of gene transfer can be quantified readily and a specific population of cells can be targeted. In addition, a high ratio of viral particles to target cells can be achieved and thus improve the transduction efficiency (Anderson et al., Hum. Gene Ther., 1990, 1: 331-341; Rosenberg et al., N. Engl. J. Med., 1990, 323: 570-578; Culver et al., Hum. Gene Ther., 1991, 2: 107-109; Nienhuis et al., Cancer, 1991, 67: 2700-2704; Grossman et al., Nat. Genet., 1994, 6: 335-341; Lotze et al., Hum. Gene Ther., 1992, 3: 167-177; Lotze, M. T., Cell Transplant., 1993, 2: 33-47; Lotze et al., Hum. Gene Ther., 1994, 5: 41-55 and U.S. Pat. No. 5,399,346 (Anderson)). In some cases direct introduction of virus in vivo is necessary. Retroviruses have been used to treat brain tumours wherein the ability of a retrovirus to infect only dividing cells (tumour cells) may be particularly advantageous.

To increase efficiency Oldfield et al., in Hum. Gene Ther., 1993, 4: 39-69 proposed the administration of a retrovirus producer cell line directly into patients' brain tumours. The murine producer cell would survive within the brain tumour for a period of days, and would secrete retrovirus capable of transducing the surrounding brain tumour. Virus carrying the herpes virus thymidine kinase gene renders cells susceptible to killing by ganciclovir, which is metabolized to a cytotoxic compound by thymidine kinase. Patent references on retroviruses are: EP 334301, WO 91/02805 & WO 92/05266 (Viagene); and U.S. Pat. No. 4,650,764 (University of Wisconsin).

Human adenoviral infections have been described (see Horwitz, M. S., In Virology, 2nd ed., Raven Press, New York, 1990, pp. 1723-1740). Most adults have prior exposure to adenovirus and have antiadenovirus antibodies. These viruses possess a double-stranded DNA genome, and replicate independent of host cell division.

Adenoviral vectors possess advantageous properties. They are capable of transducing a broad spectrum of human tissues and high levels of gene expression can be obtained in dividing and nondividing cells. Several routes of administration can be used including intravenous, intrabiliary, intraperitoneal, intravesicular, intracranial and intrathecal injection, and direct injection of the target organ. Thus targeting based on anatomical boundaries is feasible.

The adenoviral genome encodes about 15 proteins and infection involves a fiber protein to bind a cell surface receptor. The penton base of the capsid engages integrin receptor domains (αvβ₃ or αvβ₅) on the cell surface resulting in internalization of the virus. Viral DNA enters the nucleus and begins transcription without cell division. Expression and replication are under the control of the E1A and E1B genes (see Horwitz, M. S., In Virology, 2nd ed., 1990, pp. 1723-1740). Removal of E1 genes renders the virus replication-incompetent. Expression of adenoviral proteins leads to an immune response which may limit effectiveness particularly on repeat administration. However, recent approaches in which other adenoviral genes such as the E2a gene (which controls expression of the fibre knob and a number of other viral proteins) are also removed from the viral genome may abolish or greatly reduce the expression of many of these viral proteins in target cells.

Adenoviral serotypes 2 and 5 have been extensively used for vector construction. Bett et al., Proc. Nat. Acad. Sci. U.S.A., 1994, 91: 8802-8806 have used an adenoviral type 5 vector system with deletions of the E1 and E3 adenoviral genes. The 293 human embryonic kidney cell line has been engineered to express E1 proteins and can thus transcomplement the E1-deficient viral genome. The virus can be isolated from 293 cell media and purified by limited dilution plaque assays (Graham, F. L. and Prevec, L. In Methods in Molecular Biology: Gene Transfer and Expression Protocols, Humana Press 1991, pp. 109-128). Recombinant virus can be grown in 293 cell line cultures and isolated by lysing infected cells and purification by caesium chloride density centrifugation. One problem with using 293 cells for manufacture of recombinant adenovirus is that, owing to additional flanking regions of the E1 genes, they may give rise to replication competent adenovirus (RCA) during viral particle production. Although this material is only wild type adenovirus and not replication competent recombinant virus it can have significant effects on the eventual yield of the desired adenoviral material and lead to increased manufacturing costs, quality control issues for the production runs and acceptance of batches for clinical use. Alternative cell lines such as PER.C6, which have more defined E1 gene integration than 293 cells (i.e. contain no flanking viral sequence), have been developed which do not allow the recombination events which produce RCA and thus have the potential to overcome the above viral production issues.

Adenoviral vectors have the disadvantage of relatively short duration of transgene expression due to immune system clearance and dilutional loss during target cell division but improvements in vector design are anticipated. Patent references on adenoviruses are: WO 96/03517 (Boehringer); WO 96/13596 (Rhone Poulenc Rorer); WO 95/29993 (University of Michigan); and WO 96/34969 (Canji). Recent advances in adenoviral vectors for cancer gene therapy including the development of strategies to reduce immunogenicity, chimeric adenoviral/retroviral vectors and conditional (or restricted) replicative recombinant adenoviral systems are reviewed in Bilbao et al., Exp. Opin. Ther. Patents, 1997, 7(12): 1427-1446.

Adeno-associated viruses (AAV) (Kotin, R. M., Hum. Gene Ther., 1994, 5: 793-801) are single-stranded DNA, nonautonomous parvoviruses able to integrate into the genome of nondividing cells of a very broad host range. AAV has not been shown to be associated with human disease and does not elicit an immune response.

AAV has two distinct life cycle phases. Wild-type virus will infect a host cell, integrate and remain latent. In the presence of adenovirus, the lytic phase of the virus is induced, which is dependent on the expression of early adenoviral genes, and leads to active virus replication. The AAV genome is composed of two open reading frames (called rep and cap) flanked by inverted terminal repeat (ITR) sequences. The rep region encodes four proteins which mediate AAV replication, viral DNA transcription, and endonuclease functions used in host genome integration. The rep genes are the only AAV sequences required for viral replication. The cap sequence encodes structural proteins that form the viral capsid. The ITRs contain the viral origins of replication, provide encapsidation signals, and participate in viral DNA integration. Recombinant, replication-defective viruses that have been developed for gene therapy lack rep and cap sequences. Replication-defective AAV can be produced by cotransfecting the separated elements necessary for AAV replication into a permissive 293 cell line. Patent references on AAV include: WO 94/13788 (University of Pittsburgh) and U.S. Pat. No. 4,797,368 (US Department of Health).

Gene therapy vectors from pox viruses have been described (Moss, B. and Flexner, C., Annu. Rev. Immunol., 1987, 5: 305-324; Moss, B., In Virology, 1990, pp. 2079-2111). Vaccinia are large, enveloped DNA viruses that replicate in the cytoplasm of infected cells. Nondividing and dividing cells from many different tissues are infected, and gene expression from a nonintegrated genome is observed. Recombinant virus can be produced by inserting the transgene into a vaccinia-derived plasmid and transfecting this DNA into vaccinia-infected cells where homologous recombination leads to the virus production. A significant disadvantage is that vaccinia elicits a host immune response to the 150 to 200 virally encoded proteins, making repeated administration problematic.

The herpes simplex virus is a large, double-stranded DNA virus that replicates in the nucleus of infected cells and is suitable for gene delivery (see Kennedy, P. G. E. and Steiner, I., Q. J. Med., 1993, 86: 697-702). Advantages include a broad host cell range, infection of dividing and nondividing cells, and the fact that large sequences of foreign DNA can be inserted into the viral genome by homologous recombination. Disadvantages are the difficulty in rendering viral preparations free of replication-competent virus and a potent immune response. Deletion of the viral thymidine kinase gene renders the virus replication-defective in cells with low levels of thymidine kinase. Cells undergoing active cell division (e.g., tumour cells) possess sufficient thymidine kinase activity to allow replication. Cantab Pharmaceuticals have a published patent application on herpes viruses (WO 92/05263).

A variety of other viruses, including HIV, the minute virus of mice, hepatitis B virus, and influenza virus, have been considered as possible vectors for gene transfer (see Jolly, D., Cancer Gene Therapy, 1994, 1: 51-64).

The use of attenuated Salmonella typhimurium bacteria which specifically target and replicate in hypoxic environments (such as are found in the necrotic centres of tumours) as gene delivery vehicles for prodrug enzyme based therapy (Tumour Amplified Prodrug Enzyme Therapy known as TAPET™) has also been proposed and is under development by Vion Pharmaceuticals. This system offers a further gene delivery alternative to the viral and non-viral delivery approaches discussed herein.

Nonviral DNA delivery strategies are also applicable. These DNA delivery systems include uncomplexed plasmid DNA, DNA-liposome complexes, DNA-protein complexes, and DNA-coated gold particles.

Purified nucleic acid can be injected directly into tissues, resulting in transient gene expression, for example in muscle tissue; this is particularly effective in regenerating muscle (Wolff et al., Science, 1990, 247: 1465-1468). Davis et al., in Hum. Gene Ther., 1993, 4: 733-740, have published on direct injection of DNA into mature muscle. Skeletal and cardiac muscle are generally preferred. Patent references are: WO 90/11092, U.S. Pat. No. 5,589,466 (Vical) and WO 97/05185 (biodegradable DNA impregnated hydrogels for injection, Focal).

Plasmid DNA on gold particles can be “fired” into cells (e.g. epidermis or melanoma) using a gene-gun. DNA is coprecipitated onto the gold particle and then fired using an electric spark or pressurized gas as propellant (Fynan et al., Proc. Natl. Acad. Sci. U.S.A., 1993, 90: 11478-11482). Electroporation has also been used to enable transfer of DNA into solid tumours using electroporation probes employing multi-needle arrays and pulsed, rotating electric fields (Nishi et al., in Cancer Res., 1996, 56: 1050-1055). High efficiency gene transfer to subcutaneous tumours has been claimed with significant cell transfection enhancement and better distribution characteristics over intra-tumoural injection procedures.

Liposomes work by surrounding hydrophilic molecules with hydrophobic molecules to facilitate cell entry. Liposomes are unilamellar or multilamellar spheres made from lipids. Lipid composition and manufacturing processes affect liposome structure. Other molecules can be incorporated into the lipid membranes. Liposomes can be anionic or cationic. Nicolau et al., Proc. Natl. Acad. Sci. U.S.A., 1983, 80: 1068-1072, have published on insulin expression from anionic liposomes injected into rats. Anionic liposomes mainly target the reticuloendothelial cells of the liver, unless otherwise targeted. Molecules can be incorporated into the surface of liposomes to alter their behavior, for example to give cell-selective delivery (Wu, G. Y. and Wu, C. H., J. Biol. Chem., 1987, 262: 4429-4432).

Felgner et al., Proc. Natl. Acad. Sci. U.S.A., 1987, 84: 7413-7417, have published on cationic liposomes, demonstrated their binding of nucleic acids by electrostatic interactions and shown cell entry. Intravenous injection of cationic liposomes leads to transgene expression in most organs on injection into the afferent blood supply to the organ. Cationic liposomes can be administered by aerosol to target lung epithelium (Brigham et al., Am. J. Med. Sci., 1989, 298: 278-281). Patent references on liposomes are: WO 90/11092, WO 91/17424, WO 91/16024, WO 93/14788 (Vical); and WO 90/01543 (Intracel).

In vivo studies with cationic liposome transgene delivery have been published by: Nabel et al., Rev. Hum. Gene Ther., 1994, 5: 79-92; Hyde et al., Nature, 1993, 362: 250-255; and Conary et al., J. Clin. Invest., 1994, 93: 1834-1840.

Microparticles are being studied as systems for delivery of DNA to phagocytic cells. Such approaches have been pursued by Pangaea Pharmaceuticals in their ENDOSHERE™ DNA microencapsulation delivery system, which has been used to effect more efficient transduction of phagocytic cells such as macrophages, which ingest the microspheres. The microspheres encapsulate plasmid DNA encoding potentially immunogenic peptides which, when expressed, lead to peptide display via MHC molecules on the cell surface; this can stimulate an immune response against such peptides and against protein sequences which contain the same epitopes. This approach is presently aimed towards a potential role in anti-tumour and pathogen vaccine development but may have other possible gene therapy applications.

In the same way as synthetic polymers have been used to package DNA, natural viral coat proteins which are capable of homogeneous self-assembly into virus-like particles (VLPs) have been used to package DNA. The major structural coat protein VP1 of human polyoma virus can be expressed as a recombinant protein and is able to package plasmid DNA during self-assembly into a VLP. The resulting particles can subsequently be used to transduce various cell lines, while preliminary studies show little immunogenic response to such VP1 based VLPs. Such systems may offer an attractive intermediate between synthetic polymer non-viral vectors and the alternative viral delivery systems, since they may offer combined advantages, e.g. simplicity of production and high level transduction efficiency.

To improve the specificity of gene delivery and expression of the therapeutic gene, the inclusion of targeting elements in the delivery vehicles and the use of regulatory expression elements have been investigated, both singly and in combination, in many of the previously described delivery systems.

Improvements in DNA vectors have also been made and are likely applicable to all of the non-viral delivery systems. These include the use of supercoiled minicircles reported by RPR Gencell (which do not have bacterial origins of replication nor antibiotic resistance genes and thus are potentially safer as they exhibit a high level of biological containment), episomal expression vectors as developed by Copernicus Gene Systems Inc (replicating episomal expression systems where the plasmid amplifies within the nucleus but outside the chromosome and thus avoids genome integration events) and T7 systems as developed by Progenitor (a strictly cytoplasmic expression vector in which the vector itself expresses phage T7 RNA polymerase and the therapeutic gene is driven from a second T7 promoter, using the polymerase generated by the first promoter). Other, more general improvements to DNA vector technology include use of cis-acting elements to effect high levels of expression (Vical), sequences derived from alphoid repeat DNA to supply once-per-cell-cycle replication and nuclear targeting sequences (from EBNA-1 gene (Calos at Stanford, with Megabios); SV40 early promoter/enhancer or peptide sequences attached to the DNA).

Targeting systems based on cell receptor recognition by ligand linked to DNA have been described by Michael, S. I. and Curiel, D. T., Gene Therapy, 1994, 1: 223-232. Using the ligand recognized by such a receptor, the DNA becomes selectively bound and internalized into the target cell (Wu, G. Y. and Wu, C. H., J. Biol. Chem., 1987, 262: 4429-4432). Poly-L-lysine (PLL), a polycation, has been used to couple a variety of protein ligands to DNA by chemical cross-linking methods. DNA is electrostatically bound to PLL-ligand molecules. Targeting systems have been published by Zenke et al., Proc. Natl. Acad. Sci. U.S.A., 1990, 87: 3655-3659, using transferrin receptor; Wu, G. Y. and Wu, C. H., J. Biol. Chem., 1987, 262: 4429-4432, using the asialoorosomucoid receptor; and Batra et al., Gene Therapy, 1994, 1: 255-260, using cell surface carbohydrates. Agents such as chloroquine or co-localised adenovirus can be used to reduce DNA degradation in the lysosomes (see Fisher, K. J. and Wilson, J. M., Biochem. J., 1994, 299, 49-58). Cristiano et al., Proc. Natl. Acad. Sci. U.S.A., 1993, 90: 11548-11552, have constructed adenovirus-DNA-ligand complexes. Patent references on receptor mediated endocytosis are: WO 92/05250 (asialoglycoproteins, University of Connecticut) and U.S. Pat. No. 5,354,844 (transferrin receptor, Boehringer).

DNA and ligand can be coated over the surface of the adenovirus to create a coated adenovirus (Fisher, K. J. and Wilson, J. M., Biochem. J., 1994, 299, 49-58). However, the presence of two receptor pathways for DNA entry (ligand receptor and adenovirus receptor) reduces the specificity of this delivery system. The adenovirus receptor pathway can be eliminated by using an antibody against adenovirus fiber protein as the means for linkage to DNA (Michael, S. I. and Curiel, D. T., Gene Therapy, 1994, 1: 223-232). Use of purified endosomolytic proteins rather than intact adenovirus particles is another option (Seth, P., J. Virol., 1994, 68: 1204-1206).

The expression of a gene construct of the invention at its target site is preferably under the control of a transcriptional regulatory sequence (TRS). A TRS is a promoter optionally combined with an enhancer and/or a control element such as a genetic switch described below.

One example of a TRS is a “genetic switch” that may be employed to control expression of a gene construct of the invention once it has been delivered to a target cell. Control of gene expression in higher eucaryotic cells by procaryotic regulatory elements (which are preferred for the present invention) has been reviewed by Gossen et al in TIBS, Dec. 18, 1993, 471-475. Suitable systems include the E. coli lac operon and the especially preferred E. coli tetracycline resistance operon. References on the tetracycline system include Gossen et al (1995) Science 268, 1766; Damke et al (1995) Methods in Enzymology 257, Academic Press; Yin et al (1996) Anal. Biochem. 235, 195; and U.S. Pat. Nos. 5,464,758 and 5,589,362, WO 96/01313 and WO 94/29442 (Bujard). An ecdysone based switch (International Patent Appln No. PCT/GB96/01195, Publication No. WO 96/37609, Zeneca) is another option. Other options are listed below. Connaught Laboratories (WO 93/20218) describe a synthetic inducible eukaryotic promoter comprising at least two different classes of inducible elements. Rhone-Poulenc Rorer (WO 96/30512) describe a tetracycline-related application for a conditional gene expression system. Ariad (WO 94/18317) describes a protein dimerisation based system for which in vivo activity has been shown. Bert O'Malley of the Baylor College of Medicine (WO 93/23431, U.S. Pat. No. 5,364,791, WO 97/10337) describes a molecular switch based on the use of a modified steroid receptor. The Whitehead Institute have an NF-κB inducible gene expression system (WO 88/05083). Battelle Memorial have described a stress inducible promoter (European patent EP 263908).

Examples of TRSs which are independent of cell type include the following: cytomegalovirus promoter/enhancer, SV40 promoter/enhancer and retroviral long terminal repeat promoter/enhancer. Examples of TRSs which are dependent on cell type (to give an additional degree of targeting) include the following promoters: carcinoembryonic antigen (CEA) for targeting colorectal, lung and breast; alpha-foetoprotein (AFP) for targeting transformed hepatocytes; tyrosine hydroxylase, choline acetyl transferase or neurone specific enolase for targeting neuroblastomas; insulin for targeting pancreas; and glial fibro acidic protein for targeting glioblastomas. Some oncogenes which are selectively expressed in some tumours may also be used, e.g. HER-2/neu or c-erbB2 in breast and N-myc in neuroblastoma.

SUMMARY OF INVENTION

Accordingly, a preferred gene construct for use as a medicament is a construct comprising a transcriptional regulatory sequence which comprises a promoter and a control element which is a genetic switch to control expression of the gene construct. A preferred genetic switch control element is regulated by presence of tetracycline or ecdysone. A preferred promoter is dependent on cell type and is selected from the following promoters: carcinoembryonic antigen (CEA); alpha-foetoprotein (AFP); tyrosine hydroxylase; choline acetyl transferase; neurone specific enolase; insulin; glial fibro acidic protein; HER-2/neu; c-erbB2; and N-myc. Preferably the gene construct for use as a medicament described herein is packaged within an adenovirus for delivery to the mammalian host. A general review of targeted gene therapy is given in Douglas et al., Tumor Targeting, 1995, 1: 67-84.

The antibody encoded by the gene construct of the invention may be any form of antibody construct such as, for example, F(ab′)₂, F(ab′), Fab, Fv, single chain Fv and V-min. Any suitable antibody construct is contemplated; for example, a recently described antibody fragment is “L-F(ab)₂” as described by Zapata (1995) in Protein Engineering, 8, 1057-1062. Disulphide bonded Fvs are also contemplated. For constructs based on CPG2 enzyme, Fab fragment constructs dimerised through enzyme dimerisation are preferred. Non-human antibodies may be humanised for use in humans to reduce host immune responses. A humanized antibody, related fragment or antibody binding structure is a polypeptide composed largely of a structural framework of human derived immunoglobulin sequences supporting non human derived amino acid sequences in and around the antigen binding site (complementarity determining regions or CDRs). Appropriate methodology has been described in detail, for example, in WO 91/09967, EP 0328404, Queen et al., Proc. Natl. Acad. Sci. 86, 10029 (1989) and Mountain and Adair, Biotechnology and Genetic Engineering Reviews 10, 1 (1992), although alternative methods of humanisation are also contemplated, such as antibody veneering of surface residues (EP 519596, Merck/NIH, Padlan et al).

According to another aspect of the present invention there is provided a matched two component system designed for use in a mammalian host in which the components comprise:

(i) a first component that comprises a gene construct encoding a cell targeting antibody and a heterologous prodrug activating enzyme wherein the gene construct is capable of expressing the antibody and enzyme as a conjugate within a target cell in the mammalian host and wherein the conjugate can leave the cell thereafter for selective localisation at a cell surface antigen recognised by the antibody and;

(ii) a second component that comprises a prodrug which can be converted into an active drug by the enzyme.

Antibody directed enzyme prodrug therapy (ADEPT) is a known cancer therapeutic approach. ADEPT uses a tumour selective antibody conjugated to an enzyme. The conjugate is administered to the patient (usually intravenously), allowed to localise at the tumour site(s) and clear from the blood and other normal tissues. A prodrug is then administered to the patient which is converted by the enzyme (localised at the tumour site) into a cytotoxic drug which kills the tumour cells.

The present invention can be applied to any ADEPT system. Suitable examples of ADEPT systems include those based on any of the following enzymes: carboxypeptidase G2; carboxypeptidase A; aminopeptidase; alkaline phosphatase; glycosidases; β-glucuronidase; penicillin amidase; β-lactamase; cytosine deaminase; nitroreductase; or mutant host enzymes including carboxypeptidase A, carboxypeptidase B, and ribonuclease. Suitable references on ADEPT systems include Melton R G (1996) in J. National Cancer Institute 88, 1; Niculescu-Duvaz I (1995) in Current Medicinal Chemistry 2, 687; Knox R J (1995) in Clin. Immunother. 3, 136; WO 88/07378 (CRCT); Blakey et al., Cancer Res. 56, 3287-92, 1996; U.S. Pat. No. 5,587,161 (CRCT and Zeneca); WO 97/07769 (Zeneca); and WO 95/13095 (Wellcome). The heterologous enzyme may be in the form of a catalytic antibody; see for example EP 745673 (Zeneca). Review articles on ADEPT systems include Hay & Denny (1996), Drugs of the Future, 21(9), 917-931 and Blakey (1997), Exp. Opin. Ther. Patents, 7(9), 965-977.

A preferred matched two component system is one in which: the first component comprises a gene encoding the heterologous enzyme CPG2; and the second component prodrug is selected from N-(4-[N,N-bis(2-iodoethyl)amino]-phenoxycarbonyl)-L-glutamic acid, N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof. Preferred prodrugs for use with CPG2 are described in the following US patents from Zeneca Limited and Cancer Research Campaign Technology Limited: U.S. Pat. Nos. 5,714,148, 5,405,990, 5,587,161 and 5,660,829.

In another aspect of the invention there is provided a method for the delivery of a cytotoxic drug to a site which comprises administering to a host a first component that comprises a gene construct as defined herein; followed by administration to the host of a second component that comprises a prodrug which can be converted into a cytotoxic drug by the heterologous enzyme encoded by the first component. A preferred method for delivery of a cytotoxic drug to a site is one in which the first component comprises a gene encoding the heterologous enzyme CPG2; and the second component prodrug is selected from N-(4-[N,N-bis(2-iodoethyl)amino]phenoxycarbonyl)-L-glutamic acid, N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof.

Abbreviations used herein include:

AAV                 Adeno-associated virus
ADEPT               antibody directed enzyme prodrug therapy
AFP                 alpha-foetoprotein
AMIRACS             Antimetabolite with Inactivation of Rescue Agents at Cancer Sites
APS                 ammonium persulfate
b.p.                base pair
BPB                 bromophenol blue
CDRs                complementarity determining regions
CEA                 Carcinoma Embryonic Antigen
CL                  constant domain of antibody light chain
CPB                 carboxypeptidase B
CPG2                carboxypeptidase G2
CPG2 R6             carboxypeptidase G2 mutated to prevent glycosylation on expression in eucaryotic cells, see Example 1d
DAB substrate       3,3′-diaminobenzidine tetrahydrochloride
DEPC                diethylpyrocarbonate
DMEM                Dulbecco's modified Eagle's medium
ECACC               European Collection of Animal Cell Cultures
EIA                 enzyme immunoassay
ELISA               enzyme linked immunosorbent assay
FAS                 folinic acid supplemented
FCS                 foetal calf serum
Fd                  heavy chain of Fab, Fab′ or F(ab′)₂ optionally containing a hinge
GDEPT               gene directed enzyme prodrug therapy
HAMA                Human Anti Mouse Antibody
HCPB                human carboxypeptidase B, preferably pancreatic
hinge (of an IgG)   a short proline rich peptide which contains the cysteines that bridge the 2 heavy chains
HRPO or HRP         horse radish peroxidase
IRES                internal ribosome entry site
MTX                 methotrexate
NCA                 non-specific cross reacting antigen
NCIMB               National Collections of Industrial and Marine Bacteria
OPD                 ortho-phenylenediamine
PBS                 phosphate buffered saline
PCR                 polymerase chain reaction
PGP                 N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid
preproCPB           proCPB with an N-terminal leader sequence
proCPB              CPB with its N-terminal pro domain
scFv                single chain Fv
SDS-PAGE            sodium dodecyl sulphate - polyacrylamide gel electrophoresis
SSC                 salt sodium citrate
TBS                 Tris-buffered Saline
Temed               N,N,N′,N′-tetramethylethylenediamine
TFA                 trifluoroacetic acid
TRS                 transcriptional regulatory sequence
VDEPT               virus-directed enzyme prodrug therapy
VH                  variable region of the heavy antibody chain
VK                  variable region of the light antibody chain

In this specification conservative amino acid analogues of specific amino acid sequences are contemplated which retain the relevant biological properties of the component of the invention but differ in sequence by one or more conservative amino acid substitutions, deletions or additions. However the specifically listed amino acid sequences are preferred. Typical conservative amino acid substitutions are tabulated below.

Original    Exemplary Substitutions                    Preferred Substitutions
Ala (A)     Val; Leu; Ile                              Val
Arg (R)     Lys; Gln; Asn                              Lys
Asn (N)     Gln; His; Lys; Arg                         Gln
Asp (D)     Glu                                        Glu
Cys (C)     Ser                                        Ser
Gln (Q)     Asn                                        Asn
Glu (E)     Asp                                        Asp
Gly (G)     Pro                                        Pro
His (H)     Asn; Gln; Lys; Arg                         Arg
Ile (I)     Leu; Val; Met; Ala; Phe; Norleucine        Leu
Leu (L)     Norleucine; Ile; Val; Met; Ala; Phe        Ile
Lys (K)     Arg; Gln; Asn                              Arg
Met (M)     Leu; Phe; Ile                              Leu
Phe (F)     Leu; Val; Ile; Ala                         Leu
Pro (P)     Gly                                        Gly
Ser (S)     Thr                                        Thr
Thr (T)     Ser                                        Ser
Trp (W)     Tyr                                        Tyr
Tyr (Y)     Trp; Phe; Thr; Ser                         Phe
Val (V)     Ile; Leu; Met; Phe; Ala; Norleucine        Leu
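As an illustration only, the substitution table above can be held as a lookup structure. The sketch below is a minimal Python example; the names CONSERVATIVE, is_exemplary_substitution and preferred_substitution are hypothetical and are not part of the specification.

```python
# Conservative substitutions transcribed from the table above.
# Each entry maps an original residue to
# (exemplary substitutions, preferred substitution).
CONSERVATIVE = {
    "Ala": (("Val", "Leu", "Ile"), "Val"),
    "Arg": (("Lys", "Gln", "Asn"), "Lys"),
    "Asn": (("Gln", "His", "Lys", "Arg"), "Gln"),
    "Asp": (("Glu",), "Glu"),
    "Cys": (("Ser",), "Ser"),
    "Gln": (("Asn",), "Asn"),
    "Glu": (("Asp",), "Asp"),
    "Gly": (("Pro",), "Pro"),
    "His": (("Asn", "Gln", "Lys", "Arg"), "Arg"),
    "Ile": (("Leu", "Val", "Met", "Ala", "Phe", "Norleucine"), "Leu"),
    "Leu": (("Norleucine", "Ile", "Val", "Met", "Ala", "Phe"), "Ile"),
    "Lys": (("Arg", "Gln", "Asn"), "Arg"),
    "Met": (("Leu", "Phe", "Ile"), "Leu"),
    "Phe": (("Leu", "Val", "Ile", "Ala"), "Leu"),
    "Pro": (("Gly",), "Gly"),
    "Ser": (("Thr",), "Thr"),
    "Thr": (("Ser",), "Ser"),
    "Trp": (("Tyr",), "Tyr"),
    "Tyr": (("Trp", "Phe", "Thr", "Ser"), "Phe"),
    "Val": (("Ile", "Leu", "Met", "Phe", "Ala", "Norleucine"), "Leu"),
}


def is_exemplary_substitution(original, replacement):
    """Return True if `replacement` is listed as exemplary for `original`."""
    exemplary, _preferred = CONSERVATIVE[original]
    return replacement in exemplary


def preferred_substitution(original):
    """Return the preferred substitution listed for `original`."""
    return CONSERVATIVE[original][1]
```

For example, is_exemplary_substitution("Ala", "Val") is true under the table, while a change from Ala to Trp is not listed.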

Amino acid nomenclature is set out below.

Alanine          Ala   A
Arginine         Arg   R
Asparagine       Asn   N
Aspartic Acid    Asp   D
Cysteine         Cys   C
Glutamic Acid    Glu   E
Glutamine        Gln   Q
Glycine          Gly   G
Histidine        His   H
Isoleucine       Ile   I
Leucine          Leu   L
Lysine           Lys   K
Methionine       Met   M
Phenylalanine    Phe   F
Proline          Pro   P
Serine           Ser   S
Threonine        Thr   T
Tryptophan       Trp   W
Tyrosine         Tyr   Y
Valine           Val   V
Any Amino Acid   Xaa   X
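The nomenclature above maps directly onto a conversion between three-letter and one-letter codes. The following Python sketch is illustrative only; THREE_TO_ONE and to_one_letter are hypothetical names chosen for this example.

```python
# Three-letter to one-letter codes, transcribed from the nomenclature above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(residues):
    """Convert a list of three-letter residue names to a one-letter string."""
    return "".join(THREE_TO_ONE[name] for name in residues)


# One G4S repeat written out residue by residue.
g4s_repeat = ["Gly", "Gly", "Gly", "Gly", "Ser"]
linker = to_one_letter(g4s_repeat * 3)
```

Here the variable linker holds the one-letter form of three consecutive Gly-Gly-Gly-Gly-Ser repeats.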

In this specification nucleic acid variations (deletions, substitutions and additions) of specific nucleic acid sequences are contemplated which retain the ability to hybridise under stringent conditions to the specific sequence in question. Stringent conditions are defined as 6×SSC, 0.1% SDS at 60° for 5 minutes. However specifically listed nucleic acid sequences are preferred. It is contemplated that chemical analogues of natural nucleic acid structures such as “peptide nucleic acid” (PNA) may be an acceptable equivalent, particularly for purposes that do not require translation into protein (Wittung (1994) Nature 368, 561).
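For orientation, the salt content implied by 6×SSC can be worked out from the composition of 1×SSC. The specification does not define 1×SSC; the sketch below assumes the common laboratory convention of 0.15 M sodium chloride and 0.015 M trisodium citrate, and the names used are hypothetical.

```python
# Assumption (not stated in the specification): 1x SSC is taken as
# 0.15 M NaCl and 0.015 M trisodium citrate, the usual laboratory convention.
SSC_1X_MOLAR = {"NaCl": 0.15, "trisodium citrate": 0.015}


def ssc_composition(strength):
    """Return molar concentrations for SSC at the given strength (e.g. 6 for 6x)."""
    return {salt: round(molar * strength, 4) for salt, molar in SSC_1X_MOLAR.items()}


stringent_wash = ssc_composition(6)
```

Under that assumption the stringent conditions correspond to about 0.9 M sodium chloride and 0.09 M citrate, with 0.1% SDS, at 60° for 5 minutes.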

The invention will now be illustrated by reference to the following non-limiting Examples. Temperatures are in degrees Celsius.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows a representation of the fusion gene construct comprising A5B7 antibody heavy chain Fd fragment linked at its C-terminus via a flexible (G₄S)₃ peptide linker to the N-terminus of CPG2 polypeptide. SS represents the signal sequence. L represents a linker sequence. CPG2/R6 represents CPG2 with its glycosylation sites nullified through mutation as explained in the text.

FIG. 2a shows a representation of (Fab-CPG2)₂ fusion protein with dimerisation taking place through non-covalent bonding between two CPG2 molecules.

FIG. 2b shows a representation of a F(ab′)₂ antibody fragment.

FIG. 3 shows a cell based ELISA assay of secreted fusion protein material. Only the CEA positive line has increased levels of binding with increasing amounts of added fusion protein, whereas the CEA negative cell lines have only constant background binding levels throughout. The vertical axis represents optical density readings measured at 490 nm and the horizontal axis the amount of added fusion protein measured in ng of protein. The graph shows data obtained from an experiment where a number of cell lines and a negative control (no cells) were incubated with increasing amounts of fusion protein using the cell assay described in Example 6. The results show that only the LoVo (CEA positive) cell line showed an increasing OD490 reading corresponding to increasing amounts of added fusion protein. All other cell lines (CEA negative) and the control (no cells) showed only a background OD490 reading which did not increase with the addition of fusion protein. These results provide evidence that the fusion protein material binds specifically to a CEA positive cell line in a dose dependent manner and does not bind to CEA negative lines.

FIG. 4 shows retention of secreted fusion protein to recombinant LoVo tumour cells. The vertical axis represents optical density readings measured at 490 nm and the horizontal axis the amount of added anti-CEA antibody (IIE6) measured in ng/ml of protein. The experiment was performed as described in Example 7 using three different cell lines: recombinant LoVo and Colo320DM lines (which themselves secrete fusion protein) and a control parental LoVo line which does not secrete fusion protein. Firstly, the cell lines were fixed and washed to remove the existing supernatant and any unbound material, after which increasing concentrations of the anti-CEA antibody (IIE6) were added to the fixed cells. The assay was developed as described in the text to determine the level of retention of any secreted material and whether further added antibody would increase the signal. The results showed that without added anti-CEA antibody the control parental LoVo line exhibited only a background OD490 reading (as expected), whereas the recombinant LoVo line gave a very strong OD490 reading, indicating that the fusion protein material was being retained on the CEA positive LoVo cells. The CEA negative recombinant Colo320DM gave a much weaker reading than the LoVo cells but the signal was higher than background (possibly due to non-fixing of the secreted antibody early in the assay method). Increasing concentrations of the anti-CEA antibody (IIE6) added to the fixed cells showed a dose related response in the case of the parental LoVo cells, thus indicating that they are CEA positive and can bind CEA binding material (such as the fusion protein if present or added). The recombinant Colo320DM and LoVo cells showed little increase in overall OD490 signal with increasing amounts of added antibody, with the exception of the LoVo cells, which appear to show a slight response at the highest antibody dose. Since the recombinant Colo320DM cells are CEA negative, no increase in signal due to anti-CEA antibody would be expected for these cells. In the case of the recombinant LoVo cells, the additional signal due to the amounts of antibody added in this assay may be swamped, except at the highest dose, by the relative strength of the original signal.

FIG. 5 shows retention of secreted fusion protein to recombinant LoVo tumour cells. The vertical axis represents median tumour volume (cm³) and the horizontal axis time in days after dosing of the prodrug. The experiment was performed as described in Example 12 using 60 mg/kg doses of prodrug. The results show that the control GAD(c) (non-prodrug treated) tumours grew to 6 times their initial size by 11 days (post-dose day), at which time the tumours were harvested. The prodrug treated tumours GAD(d) show a significantly slower growth rate and by day 16 (post-dose day) have only reached 3 times their initial size. These data indicate at least an 11 day tumour growth delay.
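The two growth figures quoted above (6-fold by day 11 for control tumours, 3-fold by day 16 for prodrug treated tumours) can be compared on a common scale by converting each to an apparent doubling time. The sketch below assumes simple exponential growth, which the specification does not state, so the values are indicative only; doubling_time_days is a hypothetical helper.

```python
import math


def doubling_time_days(fold_increase, days):
    """Apparent doubling time, assuming exponential growth (an assumption)."""
    return days * math.log(2) / math.log(fold_increase)


# Fold increases and time points are those quoted for FIG. 5.
control_doubling = doubling_time_days(6, 11)   # GAD(c), no prodrug
treated_doubling = doubling_time_days(3, 16)   # GAD(d), prodrug treated
```

On that assumption the control tumours double roughly every 4.3 days and the treated tumours roughly every 10.1 days.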

In the Examples below, unless otherwise stated, the following methodology and materials have been applied.

DNA is recovered and purified by use of GENECLEAN™ II kit (Stratech Scientific Ltd. or Bio 101 Inc.). The kit contains: 1) 6M sodium iodide; 2) a concentrated solution of sodium chloride, Tris and EDTA for making a sodium chloride/ethanol/water wash; 3) Glassmilk, a 1.5 ml vial containing 1.25 ml of a suspension of a specially formulated silica matrix in water. This is a technique for DNA purification based on the method of Vogelstein and Gillespie published in Proceedings of the National Academy of Sciences USA (1979) Vol 76, p 615. Briefly, the kit procedure is as follows. To 1 volume of gel slice is added 3 volumes of sodium iodide solution from the kit. The agarose is melted by heating the mix at 55° for 10 min, then Glassmilk (5-10 μl) is added, mixed well and left to stand for 10 min at ambient temperature. The Glassmilk is spun down and washed 3 times with NEW WASH™ (0.5 ml) from the kit. The wash buffer is removed from the Glassmilk and DNA is eluted by incubating the Glassmilk with water (5-10 μl) at 55° for 5-10 min. The aqueous supernatant containing the eluted DNA is recovered by centrifugation. The elution step can be repeated and supernatants pooled.

Competent E. coli DH5α cells were obtained from Life Technologies Ltd (MAX™ efficiency DH5α competent cells).

Mini-preparations of double stranded plasmid DNA were made using the RPM™ DNA preparation kit from Bio 101 Inc. (cat. No 2070-400) or a similar product. The kit contains alkaline lysis solution to liberate plasmid DNA from bacterial cells and glassmilk in a spinfilter to adsorb the liberated DNA, which is then eluted with sterile water or 10 mM Tris-HCl, 1 mM EDTA, pH 7.5.

The standard PCR reaction contains 100 ng of plasmid DNA (except where stated), 5 μl dNTPs (2.5 mM), 5 μl 10×Enzyme buffer (500 mM KCl, 100 mM Tris pH 8.3, 15 mM MgCl₂ and 0.1% gelatin), 1 μl of a 25 pM/μl stock solution of each primer, 0.5 μl thermostable DNA polymerase and water to obtain a volume of 50 μl. Standard PCR conditions were: 15 cycles of PCR at 94° for 90 s; 55° for 60 s; 72° for 120 s, ending the last cycle with a further 72° for 10 min incubation.
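The final concentrations and minimum cycling time implied by the standard PCR reaction can be checked with a short calculation. The Python sketch below is illustrative; it assumes that the MgCl₂ and gelatin are components of the 10× enzyme buffer and that 2.5 mM is the dNTP stock concentration, and it ignores temperature ramp times. All names are hypothetical.

```python
REACTION_VOLUME_UL = 50.0


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Dilute a stock added at volume_ul into total_ul (units follow the stock)."""
    return stock * volume_ul / total_ul


# dNTPs: 5 ul of a 2.5 mM stock (assumed to be the stock concentration).
dntp_final_mm = final_concentration(2.5, 5.0)

# 10x enzyme buffer: 5 ul per reaction; MgCl2 and gelatin assumed to be
# buffer components.
buffer_10x = {"KCl (mM)": 500.0, "Tris pH 8.3 (mM)": 100.0,
              "MgCl2 (mM)": 15.0, "gelatin (%)": 0.1}
buffer_final = {name: final_concentration(value, 5.0)
                for name, value in buffer_10x.items()}

# Cycling: 15 cycles of 90 s + 60 s + 120 s, then a further 10 min at 72 degrees.
CYCLES = 15
STEP_SECONDS = (90, 60, 120)
FINAL_EXTENSION_SECONDS = 10 * 60
total_seconds = CYCLES * sum(STEP_SECONDS) + FINAL_EXTENSION_SECONDS
```

With these inputs the buffer is at 1× in the reaction and the programmed hold times add up to 4650 s, a little under 78 minutes before ramping.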

AMPLITAQ™, available from Perkin-Elmer Cetus, is used as the source of thermostable DNA polymerase.

General molecular biology procedures can be followed from any of the methods described in “Molecular Cloning: A Laboratory Manual” Second Edition, Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory, 1989).

Serum free medium is OPTIMEM™ I Reduced Serum Medium, GibcoBRL Cat. No. 31985. This is a modification of Eagle's Minimum Essential Medium buffered with Hepes and sodium bicarbonate, supplemented with hypoxanthine, thymidine, sodium pyruvate, L-glutamine, trace elements and growth factors.

LIPOFECTIN™ Reagent (GibcoBRL Cat. No. 18292-011) is a 1:1 (w/w) liposome formulation of the cationic lipid N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride (DOTMA) and dioleoyl phosphatidylethanolamine (DOPE) in membrane filtered water. It binds spontaneously with DNA to form a lipid-DNA complex; see Felgner et al. in Proc. Natl. Acad. Sci. USA (1987) 84, 7413.

G418 (sulphate) is GENETICIN™, GibcoBRL Cat. No 11811, an aminoglycoside antibiotic related to gentamicin used as a selecting agent in molecular genetic experiments.

For the CEA ELISA each well of a 96 well immunoplate (NUNC MAXISORB™) was coated with 50 ng CEA in 50 mM carbonate/bicarbonate coating buffer pH 9.6 (buffer capsules, Sigma C3041) and incubated at 4° overnight. The plate was washed three times with PBS-TWEEN™ (PBS+0.05% TWEEN™ 20) and then blocked with 150 μl per well of 1% BSA in PBS-TWEEN™ for 1 hour at room temperature. The plate was washed three times with PBS-TWEEN™, 100 μl of test sample was added per well and the plate was incubated at room temperature for 2 hours. The plate was washed three times with PBS-TWEEN™, 100 μl per well of a 1/500 dilution of HRPO-labelled goat anti-human kappa antibody (Sigma A 7164) in 1% BSA in PBS-TWEEN™ was added and the plate was incubated at room temperature on a rocking platform for at least 1 hour. The plate was washed three times with PBS-TWEEN™ and then once more with PBS. To detect binding, 100 μl per well of developing solution (one capsule of phosphate-citrate buffer, Sigma P4922, dissolved in 100 ml H₂O, to which is added one 30 mg tablet of o-phenylenediamine dihydrochloride, Sigma P8412) was added and the plate was incubated for up to 15 minutes. The reaction was stopped by adding 75 μl 2M H₂SO₄, and absorbance was read at 490 nm.
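Reagent quantities for a plate follow directly from the per-well figures in the CEA ELISA. The sketch below assumes a full 96 well plate with no allowance for pipetting losses, which is an assumption made for illustration; the function and variable names are hypothetical.

```python
WELLS = 96  # assumption: every well of the plate is used, with no overage


def diluted_reagent(wells, ul_per_well, dilution):
    """Return (total working volume in ul, stock volume in ul) for a 1/dilution reagent."""
    total_ul = wells * ul_per_well
    return total_ul, total_ul / dilution


cea_total_ng = WELLS * 50                      # 50 ng CEA coated per well
block_total_ul = WELLS * 150                   # 150 ul of 1% BSA block per well
conjugate_total_ul, conjugate_stock_ul = diluted_reagent(WELLS, 100, 500)
opd_mg_per_ml = 30 / 100                       # one 30 mg tablet in 100 ml
```

On these assumptions one plate needs 4.8 μg of CEA, 14.4 ml of blocking solution and 9.6 ml of diluted conjugate containing 19.2 μl of antibody stock.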

The CEA ELISA using an anti CPG2 reporter antibody was essentially as above, but instead of HRPO-labelled goat anti-human kappa antibody a 1/1000 dilution of a rabbit anti-CPG2 polyclonal serum in 1% BSA in PBS-TWEEN™ was added and incubated at room temperature on a rocking platform for 2 hours. The plate was washed three times with PBS-TWEEN™. A 1/2000 dilution of a goat anti-rabbit HRPO labelled antibody (Sigma A-6154) was then added and incubated at room temperature on a rocking platform for 1 hour. The plate was washed three times with PBS-TWEEN™ and once with PBS. To detect binding, 100 μl per well of developing solution (one capsule of phosphate-citrate buffer, Sigma P4922, dissolved in 100 ml H₂O, to which is added one 30 mg tablet of o-phenylenediamine dihydrochloride, Sigma P8412) was added and the plate was incubated for up to 15 minutes. The reaction was stopped by adding 75 μl 2M H₂SO₄, and absorbance was read at 490 nm.

Western blot analysis of transfection supernatants was performed as follows. 10% mini gels for analysis of fusion protein transfections were prepared using a suitable mini gel system (HOEFER MIGHTY SMALL™). 10% running gel is: 20 ml acrylamide; 6 ml 10×running gel buffer; 34 ml H₂O; 300 μl 20% SDS; 600 μl APS; 30 μl Temed. Running gel buffer 10× is 3.75 M Tris pH 8.6. 6% stacking gel is: 9 ml acrylamide; 4.5 ml 10×stacking gel buffer; 31.5 ml H₂O; 225 μl 20% SDS; 450 μl 10% APS; 24 μl Temed. Stacking gel buffer 10× is 1.25 M Tris pH 6.8. Electrophoresis buffer 5× for SDS/PAGE is 249 mM Tris, 799 mM glycine, 0.6% w/v SDS (pH not adjusted).
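The gel recipes can be cross-checked arithmetically. The sketch below works out the acrylamide stock concentration implied by each recipe and the final Tris and running buffer concentrations. It treats the gel volume as the sum of acrylamide, 10× buffer and water only, ignoring the sub-millilitre additions of SDS, APS and Temed (an approximation); names are hypothetical.

```python
def implied_stock_percent(gel_percent, acrylamide_ml, total_ml):
    """Acrylamide stock concentration implied by a recipe."""
    return gel_percent * total_ml / acrylamide_ml


def diluted(stock, volume_ml, total_ml):
    """Final concentration after dilution (units follow the stock)."""
    return stock * volume_ml / total_ml


running_total_ml = 20 + 6 + 34        # acrylamide + 10x buffer + water
stacking_total_ml = 9 + 4.5 + 31.5

running_stock = implied_stock_percent(10, 20, running_total_ml)
stacking_stock = implied_stock_percent(6, 9, stacking_total_ml)

running_tris_m = diluted(3.75, 6, running_total_ml)
stacking_tris_m = diluted(1.25, 4.5, stacking_total_ml)

# 5x electrophoresis buffer diluted to 1x working strength.
electrophoresis_1x = {"Tris (mM)": 249 / 5, "glycine (mM)": 799 / 5,
                      "SDS (% w/v)": 0.6 / 5}
```

Both recipes imply the same 30% acrylamide stock, and the buffers dilute to 0.375 M Tris in the running gel and 0.125 M Tris in the stacking gel.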

Preparation of samples: 2×Laemmli buffer is 0.125 M Tris; 4% SDS; 30% glycerol; 4 M urea; 0.002% BPB, optionally containing 5% β-mercaptoethanol. Supernatants: 25 μl sample + 25 μl 2×Laemmli buffer; 40 μl loaded. Standards F(ab′)₂ and CPG2: 2 μl of 10 ng/ml of standard; 8 μl of H₂O; 10 μl 2×Laemmli buffer (−mercaptoethanol); 20 μl loaded. Molecular weight markers (Amersham RAINBOW™): 8 μl sample; 8 μl 2×Laemmli buffer (+mercaptoethanol); 16 μl loaded. Running conditions: 30 milliamps until the dye front is at the bottom of the gel (approx. 1 hour). Blotting: using a semi dry blotter (LKB) onto nitrocellulose membrane. Current in milliamps = 0.7 × membrane area in cm², for 45 minutes. Blocking: 5% dried skimmed milk in PBS-TWEEN™ for 40 minutes.

Detection of F(ab′)₂: goat anti-human kappa light chain HRPO-labelled antibody, 1/2500 in 0.5% dried skimmed milk in PBS-TWEEN™, incubated overnight.

Detection of CPG2: mouse anti-CPG2 monoclonal (1/2000 in 0.5% dried skimmed milk in PBS-TWEEN™) incubated overnight; goat anti-mouse kappa light chain HRPO-labelled antibody, Sigma 674301 (1/10000 in 0.5% dried skimmed milk in PBS-TWEEN™), incubated for at least 2 hours.

Development of Blot: Chemiluminescence detection of HRPO based on luminol substrate in the presence of enhancer was used (Pierce SUPERSIGNAL™ Substrate). Substrate working solution was prepared as follows: recommended volume, 0.125 ml/cm² of blot surface. Mix equal volumes of luminol/enhancer solution and stable peroxide solution, incubate blot with working solution for 5-10 minutes, remove solution, place blot in a membrane protector and expose against autoradiographic film (usually between 30 seconds and 5 minutes).
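The two area-based settings given above (a transfer current of 0.7 milliamps per cm² of membrane and 0.125 ml of substrate working solution per cm² of blot) scale directly with membrane size. The following is a minimal sketch of that arithmetic; the function names and the 8 cm × 6 cm membrane are illustrative assumptions and are not taken from the protocol.

```python
def membrane_area_cm2(width_cm: float, height_cm: float) -> float:
    """Area of a rectangular membrane in cm^2."""
    return width_cm * height_cm


def transfer_current_ma(area_cm2: float, ma_per_cm2: float = 0.7) -> float:
    """Semi-dry blotting current: milliamps = 0.7 x cm^2 (factor from the text)."""
    return ma_per_cm2 * area_cm2


def substrate_volumes_ml(area_cm2: float, ml_per_cm2: float = 0.125) -> dict:
    """Working solution volume, split into equal parts of the two reagents."""
    total = ml_per_cm2 * area_cm2
    return {
        "total": total,
        "luminol_enhancer": total / 2,
        "stable_peroxide": total / 2,
    }


# Hypothetical 8 cm x 6 cm membrane (dimensions assumed for illustration only).
area = membrane_area_cm2(8.0, 6.0)
print(f"transfer current: {transfer_current_ma(area):.1f} mA for 45 minutes")
print(f"substrate volumes (ml): {substrate_volumes_ml(area)}")
```

Both quantities are kept as separate functions so that a different blotter or substrate recommendation can be substituted by changing a single default.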

Microorganism deposits: Plasmid pNG3-Vkss-HuCk was deposited at The National Collections of Industrial and Marine Bacteria (NCIMB), 23 St Machar Drive, Aberdeen AB2 1RY, Scotland, United Kingdom on Apr. 11, 1996 under deposit reference number NCIMB 40798 in accordance with the Budapest Treaty. Plasmid pNG4-VHss-HuIgG2CH1′ was deposited at The National Collections of Industrial and Marine Bacteria (NCIMB), 23 St Machar Drive, Aberdeen AB2 1RY, Scotland, United Kingdom on Apr. 11, 1996 under deposit reference number NCIMB 40797 in accordance with the Budapest Treaty. Plasmid pNG3-Vkss-HuCk-NEO was deposited at The National Collections of Industrial and Marine Bacteria (NCIMB), 23 St Machar Drive, Aberdeen AB2 1RY, Scotland, United Kingdom on Apr. 11, 1996 under deposit reference number NCIMB 40799 in accordance with the Budapest Treaty. Plasmid pICI266 was deposited under accession number NCIMB 40589 on Oct. 11, 1993 under the Budapest Treaty at the National Collections of Industrial and Marine Bacteria Limited (NCIMB), 23 St. Machar Drive, Aberdeen, AB2 1RY, Scotland, U.K.

Trypsinisation: Trypsin EDTA (Gibco BRL 45300-019) and Hanks balanced salt solution (HBSS; Gibco BRL 14170-088) were pre-warmed in a 37° waterbath. Existing media was removed from cultures and replaced with a volume of HBSS (half the previous media volume), and the layer of cells was washed by carefully rocking the plate or flask so as to remove any residual serum-containing media. The HBSS was removed and a volume of trypsin solution (one quarter of the original media volume) was added, the flask was gently rocked to ensure the cell layer was completely covered, and the cells were left for 5 min. Trypsin was inactivated by addition of the appropriate normal culture media (2× the volume of the trypsin solution). The cell suspension was then either counted or further diluted for continued culture, depending on the procedure to be performed.

Heat Inactivation of Foetal Calf Serum (FCS): FCS (Viralex A15-651 accredited batch, Non European) was stored at −20°. For use, the serum was completely thawed at 4° overnight. The next day, the serum was incubated for 15 min in a 37° waterbath and then transferred to a 56° waterbath for 15 min. The serum was removed and allowed to cool to room temperature before it was split into 50 ml aliquots and stored at −20°.

Normal DMEM Media (using Gibco BRL components): To 500 ml DMEM (41966-086) add 12.5 ml Hepes (15630-056); 5 ml NEAA (11140-035); 5 ml pen/strep (10378-016); and 50 ml heat inactivated FCS.

FAS Media (using Gibco BRL components unless stated otherwise): 490 ml DMEM (41966-086); 12.5 ml Hepes (15630-056); 5 ml non-essential amino acids (11140-035); 5 ml pen/strep (10378-016); 5 ml vitamins (11120-037); 5 ml basal amino acids (51051-019); Folinic Acid (Sigma F8259) to a final media concentration of 10 μg/ml; 50 ml heat inactivated FCS; 5 ml dNTP mix; and G418 50 mg/ml stock solution (to produce the appropriate selection concentration).

dNTP mix: 35 mg G (Sigma G6264), 35 mg C (Sigma C4654), 35 mg A (Sigma A4036), 35 mg U (Sigma U3003) and 125 mg T (Sigma T1895) were dissolved in 100 ml water, filter sterilised, and stored at −20°.

G418 Selection: for LoVo cells (ATCC CCL 229) selection was performed at 1.25 mg/ml; for HCT116 (ATCC CCL 247) cells and for Colo320DM (ATCC CCL 220) cells selection was performed at 1.5 mg/ml unless stated otherwise.
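The volume of the 50 mg/ml G418 stock needed to reach each selection concentration follows from the usual dilution relation C1 × V1 = C2 × V2. The sketch below illustrates this; the function name and the 100 ml batch size are assumptions made for the example, and the calculation treats the stated volume as the final media volume.

```python
G418_STOCK_MG_PER_ML = 50.0  # stock concentration given for FAS media

# Selection concentrations from the text (mg/ml).
SELECTION_MG_PER_ML = {"LoVo": 1.25, "HCT116": 1.5, "Colo320DM": 1.5}


def g418_stock_volume_ml(cell_line: str, final_volume_ml: float) -> float:
    """Stock volume giving the selection concentration in final_volume_ml.

    Uses C1*V1 = C2*V2, where final_volume_ml is the total volume
    after the stock has been added.
    """
    target = SELECTION_MG_PER_ML[cell_line]
    return target * final_volume_ml / G418_STOCK_MG_PER_ML


# Assumed example: 100 ml of selective media for each cell line.
for line in SELECTION_MG_PER_ML:
    volume = g418_stock_volume_ml(line, 100.0)
    print(f"{line}: {volume:.2f} ml stock per 100 ml media")
```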

BLUESCRIPT™ vectors were obtained from Stratagene Cloning Systems.

Tet-On gene expression vectors were obtained from Clontech (Palo Alto, Calif.) cat. no. K1621-1.

Unless stated otherwise or apparent from the context used, antibody-CPG2 fusion constructs referred to in the Examples use mutated CPG2 to prevent glycosylation.

EXAMPLE 1

Construction of an (A5B7 Fab-CPG2)₂ Fusion Protein

The construction of an (A5B7 Fab-CPG2)₂ enzyme fusion was planned with the aim of obtaining a bivalent human carcinoembryonic antigen (CEA) binding molecule which also exhibits CPG2 enzyme activity. To this end the initial construct was designed to contain an A5B7 antibody heavy chain Fd fragment linked at its C-terminus via a flexible (G₄S)₃ peptide linker to the N-terminus of the CPG2 polypeptide (FIG. 1).

The antibody A5B7 binds to human carcinoembryonic antigen (CEA) and is particularly suitable for targeting colorectal carcinoma or other CEA antigen bearing cells (the importance of CEA as a cancer associated antigen is reviewed by Shively, J. E. and Beatty, J. D. in “CRC Critical Reviews in Oncology/Hematology”, vol 2, p355-399, 1994). The CPG2 enzyme is naturally dimeric, consisting of two associated identical polypeptide subunits. Each subunit of this molecular dimer consists of a larger catalytic domain and a second smaller domain that forms the dimer interface.

In general, antibody (or antibody fragment)-enzyme conjugate or fusion proteins should be at least divalent, that is to say capable of binding at least 2 tumour associated antigens (which may be the same or different). In the case of the (A5B7 Fab-CPG2)₂ fusion protein, dimerisation of the enzyme component takes place after expression, as with the native enzyme, thus forming an enzymatic molecule which contains two Fab antibody fragments (and is thus bivalent with respect to antibody binding sites) and two molecules of CPG2 (FIG. 2a).

a) Cloning of the A5B7 Antibody Genes

Methods for the preparation, purification and characterisation of recombinant murine A5B7 F(ab′)₂ antibody have been published (International Patent Application, Zeneca Limited, WO 96/20011, see Reference Example 5 therein). In Reference Example 5, section f thereof, the A5B7 antibody genes were cloned into vectors of the GS-SYSTEM™ (Celltech), see International Patent Applications WO 87/04462, WO 89/01036, WO 86/05807 and WO 89/10404, with the A5B7 Fd cloned into pEE6 and the light chain into pEE12. These vectors were the source of the A5B7 antibody genes for the construction of the A5B7 Fab-CPG2 fusion protein.

b) Chimaeric A5B7 Vector Constructs

The A5B7 murine antibody variable regions were amplified by PCR from the pEE6 and pEE12 plasmid vectors using appropriate PCR primers which included the necessary restriction sites for direct in frame cloning of the heavy and light chain variable regions into the vectors pNG4-VHss-HuIgG2CH1′ (NCIMB deposit no. 40797) and pNG3-Vkss-HuCk-NEO (NCIMB deposit no. 40799) respectively. The resulting vectors were designated pNG4/A5B7VH-IgG2CH1′ (A5B7 chimaeric heavy chain Fd′) and pNG3/A5B7VK-HuCK-NEO (A5B7 chimaeric light chain).

c) Cloning of the CPG2 Gene

The CPG2 coding gene may be obtained from Centre for Applied Microbiology and Research, Porton Down, Salisbury, Wiltshire SP4 0JG, United Kingdom. CPG2 may also be obtained by recombinant techniques. The nucleotide coding sequence for CPG2 has been published by Minton, N. P. et al., Gene, (1984) 31, 31-38. Expression of the coding sequence has been reported in E. coli (Chambers, S. P. et al., Appl. Microbiol. Biotechnol. (1988), 29, 572-578) and in Saccharomyces cerevisiae (Clarke, L. E. et al., J. Gen. Microbiol. (1985) 131, 897-904). In addition the CPG2 gene may be produced as a synthetic DNA construct by a variety of methods and used as a source for further experiments. Total gene synthesis has been described by M. Edwards in Am. Biotech. Lab (1987), 5, 38-44, Jayaraman et al. (1991) Proc. Natl. Acad. Sci. USA 88, 4084-4088, Foguet and Lubbert (1992) Biotechniques 13, 674-675 and Pierce (1994) Biotechniques 16, 708.

In preparation for cloning the CPG2 gene, the vector pNG3-Vkss was constructed, which is a simple derivative of pNG3-Vkss-HuCk-NEO (NCIMB deposit no. 40799). This vector was constructed by first removing the Neomycin gene (since it contained an EcoRI restriction enzyme site) by digestion with the restriction enzyme XbaI, after which the vector fragment was isolated and then religated to form the plasmid pNG3/Vkss-HuCk. This intermediate vector was digested with the enzymes SacII and EcoRI, which excised the HuCk gene fragment. The digest was then loaded on a 1% agarose gel and the excised fragment separated from the remaining vector, after which the vector DNA was cut from the gel and purified. Two oligonucleotides CME 00261 and CME 00262 (SEQ ID NO: 1 and 2) were designed and synthesised. These two oligonucleotides were hybridised by adding 200 pmoles of each oligonucleotide into a total of 30 μl of H₂O, heating to 95° and allowing the solution to cool slowly to 30°. 100 pmoles of the annealed DNA product was then ligated directly into the previously prepared vector and the ligation mix transformed into E. coli. In the clones obtained, the introduction of the DNA “cassette” produced a new polylinker sequence in preparation for the subsequent CPG2 gene cloning to produce the vector pNG3-Vkss.

The CPG2 structural gene encoding amino acid residues Q26-K415 inclusive was amplified by PCR using appropriate DNA oligonucleotide primers and standard PCR reaction conditions. The reaction product was analysed using a 1% agarose gel, and a band of the expected size (approximately 1200 b.p.) was excised, purified and eluted in 20 μl H₂O. This material was then digested using the restriction enzyme SacII, after which the reaction was loaded on a 1% agarose gel and a band of the expected size (approximately 250 b.p.) was excised and subsequently purified. This fragment was ligated into the plasmid vector pNG3VKss, which had been previously digested with the restriction enzyme SacII, dephosphorylated and run on a 1% agarose gel, with the linearised vector band excised and purified, and the ligation mix was transformed into E. coli. The resultant clones were analysed for the presence and orientation of the CPG2 SacII fragment by DNA restriction analysis using the enzymes BglII and FseI. Clones which appeared to have a fragment of the correct size and orientation were confirmed by DNA sequencing. This intermediate plasmid was called pNG3-Vkss-SacIICPG2frag. This plasmid was digested with the restriction enzymes AgeI and EcoRI, dephosphorylated and the vector fragment isolated. The original CPG2 gene PCR product was also digested with AgeI and EcoRI, and an approximately 1000 bp fragment was isolated, ligated and transformed into E. coli. The resulting clones were analysed for a full length CPG2 gene (approximately 1200 bp) by digestion with the restriction enzymes HindIII and EcoRI; clones with the correct size insert were sequenced to confirm identity. Finally, this plasmid (pNG3/Vkss-CPG2) was digested with XbaI, dephosphorylated, a vector fragment isolated and the XbaI Neomycin gene fragment (approximately 1000 bp, which had also been isolated in the earlier stages) religated into the plasmid and transformed into E. coli. Resulting clones were checked for the presence and orientation of the Neomycin gene by individual digests with the enzymes XbaI and EcoRI. This vector was called pNG3-Vkss-CPG2-NEO.

d) Construction of the CPG2 R6 Variant

The plasmid pNG3-Vkss/CPG2-NEO was used as a template for the PCR mutagenesis of the CPG2 gene in order to mutate 3 potential glycosylation sites which had been identified within the natural bacterial enzyme sequence. The putative amino acid glycosylation sites (N-X-T/S) were observed at positions 222 (N-I-T), 264 (N-W-T), and 272 (N-V-S) using the positional numbering published by Minton, N. P. et al., in Gene, (1984) 31, 31-38. The asparagine residue (N) of each of the 3 glycosylation sites was mutated to glutamine (Q), thus negating the glycosylation sites to avoid any glycosylation events affecting CPG2 expression or enzyme activity.
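The search for N-X-T/S motifs and the N to Q substitution described above can be sketched computationally. The following is an illustrative example only: the 17-residue peptide is a made-up toy sequence (it is not CPG2), the function names are hypothetical, and the motif is taken literally as N, any residue, then T or S, as written in the text.

```python
import re

# N-X-T/S as stated in the text; the lookahead lets overlapping motifs be found.
SEQUON = re.compile(r"(?=(N.[ST]))")


def find_sequons(protein: str) -> list:
    """Return (1-based position of N, tripeptide) for each N-X-T/S motif."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


def mutate_n_to_q(protein: str, positions: list) -> str:
    """Replace the asparagine at each 1-based position with glutamine."""
    residues = list(protein)
    for pos in positions:
        if residues[pos - 1] != "N":
            raise ValueError(f"residue {pos} is {residues[pos - 1]}, not N")
        residues[pos - 1] = "Q"
    return "".join(residues)


# Toy peptide containing the three tripeptides named in the text (NIT, NWT, NVS).
toy = "AANITGGNWTKKNVSLL"
sites = find_sequons(toy)
mutant = mutate_n_to_q(toy, [pos for pos, _ in sites])
print(sites)
print(mutant)
print(find_sequons(mutant))
```

Running the same search on the mutated sequence is a simple check that no N-X-T/S motif remains after substitution.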

A PCR mutagenesis technique in which all 3 sites were mutated in a single reaction series was used to create the CPG2 R6 gene variant. The vector pNG3/Vkss/CPG2-NEO was used as the template for three initial PCR reactions. Reaction R1 used synthetic oligonucleotide sequence primers CME 00395 and CME 00397 (SEQ ID NOS: 3 and 4), reaction R2 used synthetic oligonucleotide sequence primers CME 00395 and CME 00399 (SEQ ID NOS: 3 and 5) and reaction R3 used synthetic oligonucleotide sequence primers CME 00396 and CME 00400 (SEQ ID NOS: 6 and 7). The products of PCR reactions R1 and R2 contained the mutated 222 and 264+272 glycosylation sites respectively, with the R3 product being a copy of the C-terminal segment of the CPG2 gene. The R2 and R3 products (R2 approximately 750 bp; R3 approximately 360 bp), after agarose gel separation and purification, were joined in a further PCR reaction. Mixtures of varying amounts of the products R2 and R3 were made and PCR reactions performed using the synthetic oligonucleotides CME 00395 and CME 00396 (SEQ ID NOS: 3 and 6). The resulting product R4 (approximately 1200 bp) was again PCR amplified using the oligonucleotides CME 00398 and CME 00396 (SEQ ID NOS: 8 and 6). The resulting product R5 (approximately 600 bp) was joined to product R1 (approximately 620 bp) in a final PCR reaction performed using the oligonucleotides CME 00395 and CME 00396 (SEQ ID NOS: 3 and 6). The resulting PCR product R6 (approximately 1200 bp), which now contained all three mutated glycosylation sites, could be cloned (after digestion with the restriction enzymes AgeI and BsrGI and isolation of the resultant fragment) into the vector pNG3/Vkss-CPG2-Neo (which had been previously cut with the restriction enzymes AgeI and BsrGI and subsequently isolated). This created the desired DNA (SEQ ID NO: 9) encoding the CPG2/R6 protein sequence (SEQ ID NO: 10) within the expression vector pNG3/Vkss-CPG2 R6-NEO.

e) Construction of the A5B7 Heavy Chain Fd-CPG2 Fusion Protein Gene

The heavy chain antibody fragment and the CPG2 enzyme genes were both obtained by PCR amplification of plasmid templates. The plasmid pNG4/A5B7VH-IgG2CH1′ was amplified with primers CME 00966 (SEQ ID NO: 11) and CME 00969 (SEQ ID NO: 12) to obtain the A5B7 Fd component (approximately 300 b.p.) and the plasmid pNG3/Vkss/CPG2 R6-NEO was amplified with primers CME 00967 (SEQ ID NO: 13) and CME 00968 (SEQ ID NO: 14) to obtain the enzyme component (approximately 1350 b.p.). In each case the PCR reaction product was loaded and separated on a 1% agarose gel, and a band of the correct product size was excised, subsequently purified and eluted in 20 μl H₂O.

A further PCR reaction was performed to join (or splice) the two purified PCR reaction products together. Standard PCR reaction conditions were used with varying amounts (between 0.5 and 2 μl) of each PCR product but utilising 25 cycles (instead of the usual 15 cycles). The reaction product was analysed using a 1% agarose gel and a band of the expected size (approximately 1650 b.p.) was excised, purified and eluted in 20 μl H₂O. This material was then digested using restriction enzymes NheI and BamHI, after which a band of the expected size (approximately 1600 b.p.) was recovered and purified. The vector pNG4/A5B7VH-IgG2CH1′ was prepared to receive the above PCR product by digestion with restriction enzymes NheI and BamHI, after which the DNA was dephosphorylated and the larger vector band was separated from the smaller NheI/BamHI fragment. The vector band was recovered and purified, and subsequently the similarly restricted PCR product was ligated into the prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and subsequently sequenced to confirm the fusion gene sequence. A number of the clones were found to be correct and one of these clones (designated R2.8) was re-named pNG4/A5B7VH-IgG2CH1/CPG2 R6 (SEQ ID NO: 15 and SEQ ID NO: 16).

f) Co-transfection, Transient Expression

The plasmids pNG4/A5B7VH-IgG2CH1/CPG2 R6 (encoding the antibody chimaeric Fd-CPG2 fusion protein) and pNG3/A5B7VK-HuCK-NEO (encoding the antibody chimaeric light chain; SEQ ID NO: 17 and SEQ ID NO: 18) were co-transfected into COS-7 cells using a LIPOFECTIN™ based procedure as described below. COS-7 cells were seeded into a 6 well plate at 2×10⁵ cells/2 ml/well from a subconfluent culture and incubated overnight at 37°, 5% CO₂. A LIPOFECTIN™/serum free medium mix was made up as follows: 12 μl LIPOFECTIN™ plus 200 μl serum free medium, incubated at room temperature for 30 minutes. A DNA/serum free medium mix was made up as follows: 4 μg DNA (2 μg of each construct) plus 200 μl serum free medium. 200 μl of the LIPOFECTIN™/serum free medium mix was then added to the DNA mix and incubated for 15 minutes at room temperature. 600 μl of serum free medium was then added to each sample. The cells were washed once with 2 ml serum free medium and then the 1 ml LIPOFECTIN™/DNA mix was added to the cells and incubated for 5 hours at 37°, 5% CO₂. The LIPOFECTIN™/DNA mix was removed from the cells and normal growth media added, after which the cells were incubated for 72 hours at 37°, 5% CO₂. The cell supernatants were then harvested.
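The per-well volumes in this procedure can be checked and scaled with a few lines of arithmetic. The sketch below reads the small reagent volumes as microlitres and the DNA masses as micrograms, which is the only reading consistent with a 1 ml final mix on a 6 well plate; the constant and function names are hypothetical, and the small volume contributed by the DNA itself is ignored.

```python
# Per-well quantities, read as microlitres (ul) and micrograms (ug).
LIPOFECTIN_UL = 12                   # LIPOFECTIN reagent
SFM_FOR_LIPOFECTIN_UL = 200          # serum free medium mixed with LIPOFECTIN
DNA_UG_PER_CONSTRUCT = 2             # heavy chain fusion and light chain plasmids
N_CONSTRUCTS = 2
SFM_FOR_DNA_UL = 200                 # serum free medium mixed with DNA
LIPOFECTIN_MIX_TRANSFERRED_UL = 200  # portion of the LIPOFECTIN mix added to DNA
SFM_TOP_UP_UL = 600                  # serum free medium added before use


def final_mix_volume_ul() -> int:
    """Volume applied to each well (DNA volume itself ignored)."""
    return SFM_FOR_DNA_UL + LIPOFECTIN_MIX_TRANSFERRED_UL + SFM_TOP_UP_UL


def scale_for_wells(n_wells: int) -> dict:
    """Total reagent amounts needed to transfect n_wells wells."""
    sfm_per_well = SFM_FOR_LIPOFECTIN_UL + SFM_FOR_DNA_UL + SFM_TOP_UP_UL
    return {
        "lipofectin_ul": LIPOFECTIN_UL * n_wells,
        "dna_total_ug": DNA_UG_PER_CONSTRUCT * N_CONSTRUCTS * n_wells,
        "serum_free_medium_ul": sfm_per_well * n_wells,
    }


print(final_mix_volume_ul(), "ul applied per well")
print(scale_for_wells(6))
```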

g) Analysis of Antibody-Enzyme Fusion Protein

The supernatant material was analysed for the presence of antibody fusion protein using a CEA-binding ELISA with an anti-human kappa light chain reporter antibody (for presence of antibody), a CEA-binding ELISA with an anti-CPG2 reporter antibody (for presence of CEA bound CPG2 fusion protein), an HPLC based CPG2 enzyme activity assay (to measure specific CPG2 activity) and SDS/PAGE followed by Western blotting (using either anti-human kappa light chain reporter or anti-CPG2 reporter antibodies) to detect expressed material.

The HPLC based enzyme activity assay clearly showed CPG2 enzyme activity to be present in the cell supernatant, and both the anti-CEA ELISA assays exhibited binding of protein at levels commensurate with a bivalent A5B7 antibody molecule. The fact that the anti-CEA ELISA detected with an anti-CPG2 reporter antibody also exhibited clear CEA binding indicated that not only antibody but also antibody-CPG2 fusion protein was binding CEA.

Western blot analysis with both reporter antibody assays clearly displayed a fusion protein subunit of the expected approximately 90 kDa size with no degradation or smaller products (such as Fab or enzyme) observable.

Since CPG2 is known only to exhibit enzyme activity when it is in a dimeric state and since only antibody enzyme fusion protein is present, this indicates that the 90 kDa fusion protein (seen under SDS/PAGE conditions) dimerises via the natural CPG2 dimerisation mechanism to form a 180 kDa dimeric antibody-enzyme fusion protein molecule (FIG. 2a) in “native” buffer conditions. Furthermore, this molecule exhibits both CPG2 enzymatic activity and CEA antigen binding properties which do not appear to be significantly different in the fusion protein compared with enzyme or antibody alone.

h) Use of Expressed Fusion Protein and CPG2 Prodrug in an In Vitro Cytotoxicity Assay

An in vitro cell killing assay was performed in which the (A5B7-CPG2 R6)₂ fusion protein was compared to a “conventional” A5B7 F(ab′)₂-CPG2 conjugate formed through linking A5B7 F(ab′)₂ to CPG2 with a chemical heterobifunctional reagent. In each case material displaying equal amounts of CPG2 enzyme activity or equal amounts of antibody-CPG2 protein was incubated with LoVo, CEA bearing, tumour cells. The cells were then washed to remove unbound protein material and subsequently resuspended in medium containing a CPG2 phenol prodrug (PGP, see Example 2 below) for a period of 1 hr, after which the cells were washed, resuspended in fresh media and left to proliferate for 4 days. Finally the cells were treated with SRB stain and their numbers determined.

The results obtained clearly showed that the (A5B7-CPG2 R6)₂ fusion protein (together with prodrug) caused at least equivalent cell kill and resulted in lower numbers of cells at the end of the assay period than the equivalent levels of A5B7 F(ab′)₂-CPG2 conjugate (with the same prodrug). Cell killing (above basal control levels) can only occur if the prodrug is converted to active drug by the CPG2 enzyme (and since the cells are washed to remove unbound protein, only cell bound enzyme will remain at the stage where the prodrug is added). Thus this experiment shows that at least as much of the A5B7-CPG2 R6 fusion protein remains bound compared with conventional A5B7 F(ab′)₂-CPG2 conjugate, since a greater degree of cell killing (presumably due to higher prodrug to drug conversion) occurs.

i) Construction of a Coexpression Fusion Protein Vector for use in Transient and Stable Cell Line Expression

For a simpler transfection methodology and the direct coupling of both expression cassettes to a single selection marker, a co-expression vector for fusion protein expression was constructed using the existing vectors pNG4/A5B7VH-IgG2CH1/CPG2 R6 (encoding the antibody Fd-CPG2 fusion protein) and pNG3/A5B7VK-HuCK-NEO (encoding the antibody light chain). The pNG4/A5B7VH-IgG2CH1/CPG2 R6 plasmid was first digested with the restriction enzyme ScaI, the reaction loaded on a 1% agarose gel and the linear vector band excised from the gel and purified. This vector DNA was then digested with restriction enzymes BglII and BamHI, the reaction loaded on a 1% agarose gel, and the desired band (approximately 2700 bp) recovered and purified. The plasmid pNG3/A5B7VK-HuCK-NEO was digested with the restriction enzyme BamHI, after which the DNA was dephosphorylated, then loaded on a 1% agarose gel and the vector band excised from the gel and purified. The heavy chain expression cassette fragment was ligated into the prepared vector and the ligation mix transformed into E. coli. The orientation was checked by a variety of restriction digests and clones selected which had the heavy chain cassette in the same direction as that of the light chain. These plasmids were termed pNG3-A5B7-CPG2/R6-coexp.-NEO.

j) Gene Switches for Protein Expression

It is foreseen that in vitro expression of CPG2 and CPG2 fusion proteins in mammalian cells may degrade media folates, leading to slow cell growth or cell death. The high activity of the CPG2 enzyme is likely to make such a folate deficiency difficult to overcome by media supplementation. However, it is thought that in the case of CPG2 or CPG2 fusion protein expression from mammalian cells in vivo, it is unlikely that such problems will occur, since the cells would be constantly replenished with all growth requirements by the normal circulatory and cellular mechanisms.

A number of options to avoid possible in vitro folic acid depletion problems have been considered. One of these solutions involves the use of tightly controlled but inducible gene switch systems such as the “TET on” or “TET off” switches (Gossen, M. et al (1995) Science 268: 1766-1769) or the ecdysone/muristerone A switch (No, D. et al (1996) PNAS 93: 3346-3351). Such systems enable precisely controlled expression of a gene of interest and allow stable transformation of mammalian cells with genes encoding toxic or potentially deleterious expression products. A gene switch would allow recombinant stable cell lines incorporating CPG2 fusion genes to be potentially more easily established, maintained and expanded for protein expression and seeding cultures for in vivo tumour growth studies.

EXAMPLE 2

HCT116 Tumour Cells Expressing the Antibody-enzyme Fusion Protein are Selectively Killed in vitro by a Prodrug

HCT116 colorectal tumour cells (ATCC CCL 247) transfected with the antibody-CPG2 fusion protein gene of Example 1 can be selectively killed by a prodrug that is converted by the enzyme into an active drug.

To demonstrate this, control non-transfected HCT116 cells or HCT116 cells transfected with the antibody-CPG2 fusion protein gene are incubated with either the prodrug, 4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl-L-glutamic acid (PGP; Blakey et al, Br. J. Cancer 72, 1083, 1995), or the corresponding drug released by CPG2, 4-[N,N-bis(2-chloroethyl)amino]phenol. PGP prodrug and drug over the concentration range of 5×10⁻⁴ to 5×10⁻⁸ M are added to 96 well microtitre plates containing 1,000-2,500 HCT116 cells/well, for 1 hr at 37°. The cells are then washed and incubated for a further three days at 37°. After washing to remove dead cells, TCA is added and the amount of cellular protein adhering to the plates is assessed by addition of SRB dye as described by Skehan et al (J. Natl. Cancer Inst. 82, 1107, 1990). Potency of the prodrug and drug is assessed by the concentration required to inhibit cell growth by 50% (IC₅₀).

Treatment of non-transfected or transfected HCT116 cells with the drug results in an IC₅₀ of approximately 1 μM. In contrast, the PGP prodrug results in an IC₅₀ of approximately 200 μM on non-transfected cells and approximately 1 μM on transfected cells. These results demonstrate that the transfected cells which express the antibody-CPG2 fusion protein can convert the PGP prodrug into the more potent active drug while non-transfected HCT116 cells are unable to convert the prodrug. Consequently the transfected HCT116 cells are over 100 fold more sensitive to the PGP prodrug in terms of cell killing compared to the non-transfected HCT116 cells (see Example 1j for issues involving possible folic acid depletion in cells).
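An IC₅₀ of this kind is commonly estimated by interpolating between the two tested concentrations that bracket 50% growth, and the relative sensitivity is then the ratio of two IC₅₀ values. The sketch below is one simple way to do this and is not necessarily the method used to obtain the values above; the interpolation on a log concentration scale, the function names and the growth percentages are all assumptions made for illustration. Only the IC₅₀ figures of 200 μM and 1 μM come from the text.

```python
import math


def ic50_log_interp(concs_molar, percent_growth):
    """Estimate IC50 by linear interpolation on log10(concentration).

    Looks for the first pair of adjacent concentrations whose growth
    values bracket 50% and interpolates between them.
    """
    pairs = sorted(zip(concs_molar, percent_growth))
    for (c_lo, g_lo), (c_hi, g_hi) in zip(pairs, pairs[1:]):
        if g_lo >= 50.0 >= g_hi and g_lo != g_hi:
            frac = (g_lo - 50.0) / (g_lo - g_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("50% growth is not bracketed by the data")


def fold_sensitisation(ic50_control, ic50_transfected):
    """How many times more sensitive the transfected cells are."""
    return ic50_control / ic50_transfected


# Hypothetical dose-response data over the concentration range used in the assay.
concs = [5e-8, 5e-7, 5e-6, 5e-5, 5e-4]
growth = [100.0, 75.0, 25.0, 5.0, 0.0]  # invented values, percent of control
print(f"estimated IC50: {ic50_log_interp(concs, growth):.2e} M")

# IC50 values reported in the text: about 200 uM versus about 1 uM.
print(f"fold sensitisation: {fold_sensitisation(200e-6, 1e-6):.0f}")
```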

These studies demonstrate that transfecting tumour cells with a gene for an antibody-enzyme fusion protein can lead to selective tumour cell killing with a prodrug.

EXAMPLE 3

Anti-tumour Activity of PGP Prodrug in HCT116 Tumours Expressing the Antibody-CPG2 Fusion Protein

The anti-tumour activity in vivo of the PGP prodrug in HCT116 tumours expressing the antibody-CPG2 fusion protein can be demonstrated as follows. HCT116 tumour cells transfected with the antibody-CPG2 fusion protein gene or control non-transfected HCT116 tumour cells are injected subcutaneously into athymic nude mice (10⁷ tumour cells per mouse). When the tumours are 5-7 mm in diameter the PGP prodrug is administered i.p. to the mice (3 doses at hourly intervals over 2 h in dose ranges of 5-25 mg kg⁻¹). The anti-tumour effects are judged by measuring the length of the tumours in two directions and calculating the tumour volume using the formula:

Volume = π/6 × D² × d

where D is the larger diameter and d is the smaller diameter of the tumour.

Tumour volume is expressed relative to the tumour volume at the time the PGP prodrug is administered. The anti-tumour activity is compared to a control group receiving either transfected or non-transfected tumour cells and PBS (170 mM NaCl, 3.4 mM KCl, 12 mM Na₂HPO₄ and 1.8 mM KH₂PO₄, pH 7.2) instead of the PGP prodrug.
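The volume formula and the normalisation to the volume at dosing can be sketched as follows. The formula is implemented exactly as printed above (π/6 × D² × d, with D the larger diameter); the function names and the caliper readings are hypothetical and are used only to illustrate the calculation.

```python
import math


def tumour_volume_mm3(diameter_a_mm: float, diameter_b_mm: float) -> float:
    """Volume = pi/6 x D^2 x d, with D the larger and d the smaller diameter."""
    larger = max(diameter_a_mm, diameter_b_mm)
    smaller = min(diameter_a_mm, diameter_b_mm)
    return math.pi / 6.0 * larger ** 2 * smaller


def relative_volume(volume_now: float, volume_at_dosing: float) -> float:
    """Tumour volume expressed relative to the volume when prodrug was given."""
    return volume_now / volume_at_dosing


# Hypothetical caliper readings in mm (not data from the study).
at_dosing = tumour_volume_mm3(6.0, 5.0)
later = tumour_volume_mm3(9.0, 7.0)
print(f"volume at dosing: {at_dosing:.1f} mm^3")
print(f"relative volume later: {relative_volume(later, at_dosing):.2f}")
```

Taking the maximum and minimum of the two readings means the order in which the diameters are entered does not affect the result.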

Administration of PGP to HCT116 tumours established from transfected HCT116 cells results in a significant anti-tumour effect, as judged by the PGP treated tumours decreasing in size compared to the PBS treated tumours and by it taking a significantly longer time for the PGP treated tumours to reach 4 times their initial tumour volume compared to PBS treated tumours. In contrast, administration of PGP to HCT116 tumours established from non-transfected cells resulted in no significant anti-tumour activity.

Similar studies can be used to demonstrate that the antibody-enzyme gene delivered in an appropriate vector to established HCT116 tumours produced from non-transfected HCT116 cells, when used in combination with the PGP prodrug, can result in significant anti-tumour activity. Thus non-transfected HCT116 cells are injected into athymic nude mice (1×10⁷ tumour cells per mouse) and once the tumours are 5-7 mm in diameter the vector containing the antibody-enzyme fusion protein gene is injected intra-tumourally. After 1-3 days to allow the antibody-enzyme fusion protein to be expressed by and bind to the HCT116 tumour cells, the PGP prodrug is administered as described above. This results in significant anti-tumour activity compared to control mice receiving PBS instead of PGP prodrug.

EXAMPLE 4

Improved Transfection of Adherent Cell Lines Using Supplemented FAS Media and/or V-79 Feeder Cells

It was foreseen that in vitro expression of CPG2 and CPG2 fusion proteins in mammalian cells may degrade media folates, leading to slow cell growth or cell death. FAS (folinic acid supplemented) media described herein was developed for CPG2 and CPG2 fusion protein expressing cell lines in order to better support the growth of such cell lines.

In preparation for transfection, adherent cell lines were cultured in normal DMEM media and passaged at least three times before transfection. V-79 (hamster lung fibroblast, obtained from MRC Radiobiology Unit, Harwell, Oxford, United Kingdom) feeder cells were cultured in normal DMEM media and passaged three times before use. For the transfection, a viable count (using a haemocytometer/trypan blue staining) of the adherent cells was made and the cells plated out at 2×10⁵ cells per well into a 6 well plate (Costar 3516) and left for 18-24 hours for the cells to re-adhere.

For each individual transfection, 20 μl of LIPOFECTIN™ was added to 80 μl serum free medium and left at room temperature for 30 minutes. Plasmid DNA (2 μg) of interest was added to 100 μl serum free medium and subsequently added to the LIPOFECTIN™ mix and left for a further 15 minutes. The individual 6 well plates were washed with 2 ml serum free medium per well to remove any serum, which was replaced with 800 μl of fresh serum free medium. The 200 μl DNA/LIPOFECTIN™/serum free medium mixes which had been previously prepared were then added to each well of cells. The plates were incubated at 37° for 5 hours, the media removed and 2 ml of fresh normal media added and incubated for a further 48 hours. The transfected cells in the 6 well plate were scraped free, the cell suspension removed and centrifuged. All the supernatant was removed and the cell pellet resuspended in 20 ml of the appropriate fresh growth media (e.g. FAS DMEM media) containing the appropriate selective agent for the transfected DNA (e.g. G418). Aliquots (200 μl) were plated per well into a 96 well plate (1.25×10⁴ cells per well).
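The plating figures above imply a particular cell density and total cell number, which can be back-calculated as a consistency check. The sketch below uses only the stated 20 ml resuspension volume, 200 μl aliquots and 1.25×10⁴ cells per well; the function names are hypothetical and the derived totals are implied values, not measurements reported in the text.

```python
RESUSPENSION_ML = 20.0     # volume the cell pellet is resuspended in
ALIQUOT_UL = 200.0         # volume plated per well of the 96 well plate
CELLS_PER_WELL = 1.25e4    # stated seeding number per well


def implied_density_cells_per_ml(cells_per_well: float, aliquot_ul: float) -> float:
    """Cell density of the suspension implied by the per-well seeding number."""
    return cells_per_well / (aliquot_ul / 1000.0)


def implied_total_cells(cells_per_well: float, aliquot_ul: float, volume_ml: float) -> float:
    """Total cells in the resuspended pellet implied by the plating figures."""
    return implied_density_cells_per_ml(cells_per_well, aliquot_ul) * volume_ml


def aliquots_available(volume_ml: float, aliquot_ul: float) -> float:
    """Number of aliquots the suspension provides."""
    return volume_ml * 1000.0 / aliquot_ul


density = implied_density_cells_per_ml(CELLS_PER_WELL, ALIQUOT_UL)
total = implied_total_cells(CELLS_PER_WELL, ALIQUOT_UL, RESUSPENSION_ML)
print(f"implied density: {density:.3g} cells/ml")
print(f"implied total: {total:.3g} cells")
print(f"aliquots available: {aliquots_available(RESUSPENSION_ML, ALIQUOT_UL):.0f}")
```

On these figures the suspension yields slightly more aliquots than one 96 well plate requires.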

To enhance clone expansion, fibroblast feeder cells may be added to the transfected cells. Semi-confluent V-79 feeder cells were trypsinised and a viable count performed. The cells were resuspended to 1×10⁶ cells/ml in a sterile glass container and irradiated using a Caesium source by exposure to 5000 rads over 12 minutes. The cells can then be stored at 4° for 24-48 hours (irradiated cells are metabolically active but will not divide, and so can act as “feeders” for other cells without contaminating the culture). The feeder cells should be plated out at 4×10⁴ cells per well in a 96 well plate to produce a confluent layer for the emerging recombinant clones. Feeder cells initially adhere to the plate but with time detach and float off into the media, leaving any recombinant clone still attached to the well. Media changes (200 μl at a time) are performed twice weekly to remove floating cells and replenish media. Colonies were allowed to develop for 10-14 days, then the supernatant was screened by standard ELISA assay for fusion protein secretion.

To measure the expression rate in the case of the (A5B7-CPG2)₂ fusion gene constructs, recombinant cells were seeded out at 1×10⁶ in 10 ml fresh normal culture media for exactly 24 hours. The supernatant was then removed, centrifuged to remove cell debris and assayed for fusion protein and enzyme activity by the ELISA and HPLC methods described above. The results for a number of recombinant (A5B7-CPG2)₂ fusion protein cell lines are shown below.

Cell Line      Clone    ng/10⁶ cells/24 h
HCT 116        F7       6550
               C12      3210
HCT 116        F6       15560
               C1       6151
               B3       4502
               A8       4650
               D5       630
               H9       610
               G11      2081
               H4       2380
               A4       1634
LoVo           B9       8370
               C1       7350
               F12      2983
               C7       10770
               G10      4140
Colo 320DM     B3       10540
               G4       4720
               B9       885
               B10      3090
               F12      35660
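For comparison across cell lines, the expression figures listed above can be summarised as in the following illustrative sketch, which picks out the highest expressing clone for each line. The function name is hypothetical. Two assumptions are made: the run-together entries are read as clone F6 at 15560 and clone G4 at 4720, on the basis that clone names are 96 well plate coordinates, and the two HCT 116 groups are pooled under one key.

```python
# (cell line, clone, ng fusion protein per 1e6 cells per 24 h), from the table.
results = [
    ("HCT 116", "F7", 6550), ("HCT 116", "C12", 3210),
    ("HCT 116", "F6", 15560), ("HCT 116", "C1", 6151),
    ("HCT 116", "B3", 4502), ("HCT 116", "A8", 4650),
    ("HCT 116", "D5", 630), ("HCT 116", "H9", 610),
    ("HCT 116", "G11", 2081), ("HCT 116", "H4", 2380),
    ("HCT 116", "A4", 1634),
    ("LoVo", "B9", 8370), ("LoVo", "C1", 7350), ("LoVo", "F12", 2983),
    ("LoVo", "C7", 10770), ("LoVo", "G10", 4140),
    ("Colo 320DM", "B3", 10540), ("Colo 320DM", "G4", 4720),
    ("Colo 320DM", "B9", 885), ("Colo 320DM", "B10", 3090),
    ("Colo 320DM", "F12", 35660),
]


def best_clone_per_line(rows):
    """Map each cell line to its (clone, expression) pair with the highest value."""
    best = {}
    for line, clone, ng in rows:
        if line not in best or ng > best[line][1]:
            best[line] = (clone, ng)
    return best


best = best_clone_per_line(results)
for line, (clone, ng) in best.items():
    print(f"{line}: clone {clone}, {ng} ng/1e6 cells/24 h")
```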

EXAMPLE 5

Construction of a Stable Inducible (A5B7-CPG2)₂ Fusion Protein Expressing Tumour Cell Line

a) Construction of an Inducible Fusion Protein Expression Vector

To facilitate expression from a single inducible mammalian cell promoter, an IRES (Internal Ribosome Entry Site; see Y. Sugimoto et al., Biotechnology (1994), 12, 694-8) based version of the (A5B7-CPG2)₂ fusion protein was constructed. Construct pNG3/A5B7VK-HuCK-NEO (A5B7 chimaeric light chain; described in Example 1b above) was used as a template for amplification of the light chain gene. The gene was amplified using oligonucleotides CME 3153 and CME 3231 (SEQ ID NOS 19 and 20). A PCR product of the expected size (approximately 700 b.p.) was purified. This product was then digested using the restriction enzymes EcoRI and BamHI and subsequently purified. The fragment was cloned into the Bluescript™ KS+ vector, which had been prepared to receive the fragment by digestion with the same restriction enzymes, EcoRI and BamHI, after which the DNA was dephosphorylated and the larger vector band purified. The similarly restricted PCR fragment was ligated in to the prepared vector and the ligation mix was transformed into E. coli. DNA was prepared from the clones obtained and analysed by restriction digestion to check for insertion of the PCR fragment. Appropriate clones were sequenced to confirm the gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation A5B7 Bluescript™.

In a similar manner, the chimaeric A5B7 heavy chain was amplified by PCR from the plasmid pNG4/A5B7VH-IgG2CH1/CPG2 R6 (described in Example 1e above) using oligonucleotides CME 3151 and CME 3152 (SEQ ID NOS 21 and 22). A PCR reaction product of the expected size (approximately 1800 b.p.) was purified. This product was then digested using the restriction enzymes BamHI and XbaI after which the fragment band was purified. The fragment was also cloned into the Bluescript™ KS+ vector which had been prepared to receive the above fragment by digestion with the same restriction enzymes, BamHI and XbaI, after which the DNA was dephosphorylated and the larger vector band was purified. The similarly restricted PCR fragment was ligated in to the prepared vector and the ligation mix was transformed into E. coli. DNA was prepared from the clones obtained and analysed by restriction digestion to check for insertion of the PCR fragment. Appropriate clones were sequenced to confirm the gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation Bluescript™ Fd-CPG2 R6.

The IRES sequence was sourced from the vector pSXLC (described in Y. Sugimoto et al., Biotechnology (1994), 12, 694-8, and obtained from the authors). The IRES sequence was excised by digestion with the restriction enzymes BamHI and NcoI. A band of the expected size (approximately 500 b.p.) was purified and ligated into the Bluescript™ Fd-CPG2 R6 plasmid (which had previously been prepared by restriction with the same enzymes). The ligation mix was transformed into E. coli and DNA was prepared from the clones obtained. The DNA was analysed by restriction digestion to check for insertion of the fragment and appropriate clones were subsequently sequenced to confirm the gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation Bluescript™ IRES Fd-CPG2 R6.

To facilitate later cloning steps, it was necessary to delete the XbaI site which had been carried over in the IRES fragment. This was performed by PCR mutagenesis with the oligonucleotide primers CME 3322 and CME 3306 (SEQ ID NOS: 23 and 24) and the Bluescript™ IRES Fd-CPG2 R6 as template DNA. A PCR reaction product of the expected size (approximately 500 b.p.) was purified, digested with the restriction enzymes BamHI and NcoI and ligated into the Bluescript™ IRES Fd-CPG2 R6 plasmid (which had previously been prepared by restriction with the same restriction enzymes). The ligation mix was transformed into E. coli and DNA was prepared from the clones obtained. The DNA was analysed by restriction digestion to check for insertion of the fragment and appropriate clones were subsequently sequenced to confirm the gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation Bluescript™ IRES Fd-CPG2 R6-Xba del.

The A5B7 chimaeric light chain fragment was excised from the A5B7 Bluescript™ plasmid by digestion with the restriction enzymes EcoRI and BamHI. A band of the expected size (approximately 700 b.p.) was purified, ligated into the appropriately prepared Bluescript™ IRES Fd-CPG2 R6-Xba del plasmid and the ligation mix was transformed into E. coli. DNA was prepared from the clones obtained and analysed by restriction digestion to check for insertion of the fragment. Appropriate clones were subsequently sequenced to confirm the gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation Bluescript™ A5B7 IRES Fd-CPG2 R6-Xba del. The complete IRES based A5B7 chimaeric fusion protein gene sequence is shown in SEQ ID NO: 52.

The IRES based A5B7 chimaeric fusion protein gene was then transferred to a tetracycline regulated expression vector. Vectors for the Tet On gene expression system were obtained from Clontech. The tetracycline switchable expression vector pTRE (otherwise known as pHUD10-3, see Gossen et al. (1992), PNAS, 89, 5547-51) was prepared to accept the IRES based fusion protein cassette by digestion with the restriction enzymes EcoRI and XbaI, dephosphorylated and the larger vector band purified. The IRES gene cassette was excised from the Bluescript™ A5B7 IRES Fd-CPG2 R6-Xba del plasmid using the same restriction enzymes. The approximately 3000 b.p. fragment obtained was ligated in to the prepared vector and the ligation mix was transformed into E. coli. DNA was prepared from the clones obtained and analysed by restriction digestion to check for insertion of the fragment. Appropriate clones were subsequently sequenced to confirm the gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation pHUD10-3/A5B7 IRES Fd-CPG2 R6.

b) Construction of a Stable Inducible Fusion Protein Expressing Cell Line

The standard lipofection transfection methodology (as described previously but without the use of feeder cells) was used to produce recombinant HCT116 tumour cell lines. A co-transfection using 1 μg of the pHUD10-3/A5B7 IRES Fd-CPG2 R6 plasmid and 1 μg of the pTet-On transactivator expressing plasmid (from the Clontech kit) was performed and positive clones selected using FAS media containing 750 μg G418/ml.

c) Induction Studies of Recombinant HCT116 Inducible Cell Lines

The clone cultures obtained were split in to duplicate 48 well plates, each containing 1×10⁶ cells. The cells were grown for 48 h with one of the plates induced with 2 μg/ml doxycycline and the other acting as a non-induced control. Expression of the (A5B7-CPG2)₂ fusion protein in the cell supernatant was tested using the ELISA/Western blot assays described in Example 1g. The results indicated that induction of fusion protein from the inducible cell line by use of doxycycline could be clearly demonstrated. For example, for one of the clones obtained (F11), the induced cells produced 120 ng/ml of fusion protein in the supernatant whereas the non-induced cells produced only background levels of fusion protein (below 1 ng/ml).

EXAMPLE 6

Cell Based ELISA Assay of Secreted Fusion Protein Material

Cells were seeded into 96 well plates (Becton Dickinson Biocoat™ poly-D-Lysine, 35-6461) at a density of 1×10⁴ cells per well in 100 μl normal culture media and left about 40 h at 37°. 100 μl of 6% formaldehyde was diluted in DMEM and left for 1 hour at 4°. Plates were centrifuged and washed 3 times in PBS containing 0.05% Tween™ by immersion soaking (first two washes for 2 minutes and the final wash for 5 minutes).

100 μl of doubling dilutions of cell culture supernatant containing fusion protein or chimaeric A5B7 anti-CEA were added to each well as appropriate and the plates incubated overnight at 4°. The plates were washed as described above and, in the case of chimaeric fusion proteins, 100 μl of 1:1000 dilution of HRP labelled anti-human kappa antibody (Sigma A-7164) was added and incubated for 2 hours at room temperature (an anti-CPG2 detection methodology can be used in the case of murine scFv fusion proteins). The plates were washed as described above and HRP detected using OPD substrate (Sigma P-8412). Colour was allowed to develop for about 5 min, stopped with 75 μl per well of 2M H₂SO₄ and OD read at 490 nm.
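A doubling dilution series of the kind referred to above halves the concentration at each step. The following minimal sketch illustrates this; the starting concentration and the number of steps are hypothetical values chosen only for the example, and the function name is not taken from the original method.

```python
def doubling_dilutions(start_conc, steps):
    """Concentrations for a doubling (two-fold) dilution series.

    The first entry is the undiluted sample; each later entry is half the
    previous one.
    """
    return [start_conc / 2 ** i for i in range(steps)]


# Hypothetical example: supernatant at 100 ng/ml fusion protein, 4 wells.
series = doubling_dilutions(100.0, 4)
print(series)
```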

In the case of the (A5B7-CPG2)₂ fusion protein, material was produced in the supernatant from recombinant Colo320DM tumour cells (CEA-ve). The fusion protein content was measured by use of the CEA ELISAs described above. Increasing amounts of fusion protein were added to a number of CEA negative cell lines and the CEA positive LoVo parental line. The results shown in FIG. 3 clearly show that only the CEA positive line shows increased levels of binding with increasing amounts of added fusion protein whereas the CEA negative cell lines show only constant background binding levels throughout. This clearly demonstrates that the fusion protein specifically binds and is retained on CEA positive LoVo cells.

EXAMPLE 7

Recombinant LoVo Tumour Cells Expressing Antibody-enzyme Fusion Protein Exhibit Retention of the Fusion Protein on the Cell Surface

LoVo colorectal tumour cells transfected with the (A5B7-CPG2)₂ fusion protein gene have been shown both to secrete and to retain the fusion protein on their cell surface. This can be demonstrated by comparing parental and recombinant fusion protein expressing LoVo cells under the conditions set out in the cell based ELISA assay of secreted fusion protein (FIG. 4). On development of the colour reaction it could be seen that the recombinant LoVo cells had retained the expressed fusion protein (by showing a high level of colour). In control experiments, using Colo320DM fusion protein expressing cells, the assay showed some retention of the expressed fusion protein (probably non-specific) and the parental LoVo cells only exhibited background activity. Positive controls in which CEA binding antibody was added to test recombinant fusion protein expressing tumour cells and to the parental LoVo controls resulted in a signal being obtained from the parental LoVo (thus demonstrating that CEA was present on the parental cells) but no increased signal from the Colo320DM (CEA negative). The recombinant LoVo cells still gave such a strong initial signal that the added antibody made little difference to the overall signal obtained, which was considerably higher than any of the control experiments. Thus it appears that anti-CEA antibody enzyme-CPG2 fusion protein secreted from CEA positive tumour cell lines binds to the surface of the cells (via CEA) whereas the same protein expressed from CEA negative tumours shows no such binding.

EXAMPLE 8

LoVo Tumour Cells Expressing the Antibody-enzyme Fusion Protein are Selectively Killed in vitro by a Prodrug

LoVo colorectal tumour cells, transfected with the (A5B7-CPG2)₂ fusion protein gene, can be selectively killed by a prodrug that is converted by CPG2 enzyme into an active drug.

To demonstrate this, control non-transfected LoVo cells or LoVo cells transfected with an antibody-CPG2 fusion protein gene are incubated with either the prodrug, 4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl-L-glutamic acid (PGP; Blakey et al, (1995) Br. J. Cancer 72, p1083), or the corresponding drug released by CPG2, 4-[N,N-bis(2-chloroethyl)amino]phenol, as described in Example 2 with HCT116 cells.

The transfected cells which express the antibody-CPG2 fusion protein can convert the PGP prodrug into the more potent active drug while non-transfected LoVo cells are unable to convert the prodrug.

These studies demonstrate that transfecting tumour cells with a gene for an antibody-enzyme fusion protein can lead to selective tumour cell killing with a prodrug.

EXAMPLE 9

Establishment of Fusion Protein Expressing LoVo Tumour Xenografts in Athymic Mice

Recombinant LoVo fusion protein (A5B7-CPG2)₂ expressing tumour cells or mixes of recombinant and parental LoVo cells were injected subcutaneously into athymic nude mice (10⁷ tumour cells per mouse). The tumour growth rates for both 100% recombinant and 20%: 80% mixes of recombinant:parental LoVo cells were compared to those of parental cell only tumours. No significant differences were seen in the growth curves obtained, showing that no corrections were required during comparisons between the cell lines. The tumour growth rates observed showed that in each case the xenograft tumours take about 12 days to reach a size of 10×10 mm.
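The composition of a mixed inoculum follows directly from the figures given. The following is a minimal illustrative sketch, with a hypothetical function name, of the split of 10⁷ tumour cells per mouse into recombinant and parental cells for a 20%: 80% mix.

```python
def mix_counts(total_cells, recombinant_fraction):
    """Split an inoculum into (recombinant, parental) cell counts."""
    if not 0.0 <= recombinant_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    recombinant = round(total_cells * recombinant_fraction)
    return recombinant, total_cells - recombinant


# 20%: 80% recombinant:parental mix at 1e7 cells per mouse (from the text).
recombinant, parental = mix_counts(10**7, 0.20)
print(recombinant, parental)
```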

EXAMPLE 10

Determination of Enzyme Activity in Tumour Xenograft Samples

To act as a standard for the assay, a CPG2 enzyme standard curve was prepared in 20% homogenate of normal tumour (parental cell tumour). Subsequent dilutions of samples were made in the same 20% homogenate of normal tumour.

Excised tumour tissue (previously flash frozen in liquid nitrogen) was removed from −80° storage and allowed to thaw. Any residual skin tissue was removed before the tumour was cut up in to small fragments with a scalpel. The tumour tissue was transferred to a preweighed tube and the weight of tumour tissue measured. PBS containing 0.2 mM ZnCl₂ solution was added to each tumour sample to give a 20% (w/v) mix, homogenised and placed on ice. Dilutions of sample tumours (in 20% normal tumour homogenate) were prepared e.g. neat, 1/10, 1/20 and 1/40.

For the standard curve, dilutions of CPG2 enzyme were made to the following concentrations to a final volume of 400 μl. Similarly, 400 μl of each of the recombinant tumour sample dilutions were also prepared. After equilibration to 30°, 4 μl of 10 mM methotrexate (MTX) solution was added. The reaction was stopped after exactly 10 minutes by adding 600 μl ice cold methanol+0.2% TFA, centrifuged and the supernatant collected. The substrate and product in the supernatant were then separated by HPLC (using a Cation Exchange Column, HICROM™ S5SCX-100 A, mobile phase=60% methanol, 40% 60 mM ammonium formate/0.1% TFA, detection 300 nm). To calculate enzyme activity in the tumour tissue, the standard curve was plotted as units of area of methotrexate metabolite (the standards are such that only 20-30% of the substrate is metabolised so ensuring this is not rate limiting). The test samples were analysed by comparing the unit area of metabolite against the standard curve and then multiplying by the dilution factor. Finally, making the working assumption that 1 ml=1 g the results were multiplied by 5 (as the samples were originally diluted to a 20% homogenate).
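The calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the standard curve is assumed to be linear and is fitted by least squares, the standard concentrations (0.2 to 1.0 U/ml) are borrowed from the plasma assay of Example 11 because none are listed here, and the peak areas and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares straight line through (x, y) points: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tumour_activity_u_per_g(area, slope, intercept, dilution_factor,
                            homogenate_fraction=0.20):
    """Enzyme activity in tumour tissue (U/g).

    Reads the metabolite peak area off the standard curve (U/ml), multiplies
    by the sample dilution factor, then corrects for the 20% homogenate
    (dividing by 0.20 is the multiplication by 5 in the text, assuming
    1 ml = 1 g).
    """
    u_per_ml = (area - intercept) / slope
    return u_per_ml * dilution_factor / homogenate_fraction


# Hypothetical standards: CPG2 in U/ml against metabolite peak area.
standards_u_per_ml = [0.2, 0.4, 0.6, 0.8, 1.0]
standard_areas = [100.0, 200.0, 300.0, 400.0, 500.0]
slope, intercept = fit_line(standards_u_per_ml, standard_areas)

# Hypothetical test sample: peak area 130 measured on a 1/10 dilution.
activity = tumour_activity_u_per_g(130.0, slope, intercept, dilution_factor=10)
print(round(activity, 2))
```

Any sample whose peak area falls outside the range of the standards would need to be re-assayed at a different dilution rather than extrapolated.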

Tumours of 20% recombinant: 80% parental LoVo cells expressing (A5B7 Fab-CPG2)₂ fusion protein gave the following results: tumours taken at day 5 had an average enzyme activity of 0.26 U/g (range 0.18-0.36 U/g) and tumours taken at day 12 had an average enzyme activity of 0.65 U/g (range 0.19-1.1 U/g).

EXAMPLE 11

Determination of Enzyme Activity in Plasma Samples

To act as a standard for the assay, a CPG2 enzyme standard curve was prepared in 20% normal plasma to the following concentrations: 0.2, 0.4, 0.6, 0.8 and 1.0 U/ml. Similarly all test plasma samples were also diluted to 20% normal plasma. Further dilutions of these samples, e.g. neat, 1/10, 1/20 and 1/50, were also made using 20% normal serum. 200 μl aliquots of each CPG2 standard and test sample dilutions were equilibrated to 30°. 2 μl of 10 mM MTX was added to each of the tubes and mixed well. The reaction was stopped after exactly 10 minutes (to increase the sensitivity of the assay the incubation time can be increased to 30 minutes) by adding 500 μl ice cold methanol+0.2% TFA and assayed for product using HPLC detection as described above in Example 10.

No activity was seen in the plasma except in the rare cases when the level of enzyme activity in the tumour exceeded 2.0 U/g, in which case the plasma enzyme levels were measured in the range of 0.013 to 0.045 U/ml.

EXAMPLE 12

Anti-tumour Activity of PGP Prodrug in LoVo Tumours Expressing the Antibody-CPG2 Fusion Protein

Recombinant LoVo (A5B7-CPG2)₂ fusion protein expressing tumour cells or mixes of recombinant and parental LoVo cells were injected subcutaneously into athymic nude mice as described in Example 9.

When the tumours are 5-7 mm in diameter the PGP prodrug is administered i.p. to the mice (3 doses in DMSO/0.15 M sodium bicarbonate buffer at hourly intervals over 2 h in dose ranges of 40-80 mg kg⁻¹).

Anti-tumour effects are judged by measuring the length of the tumours in two directions and calculating the tumour volume using the formula

Volume = π/6 × D² × d

where D is the larger diameter and d is the smaller diameter of the tumour. Tumour volume may be expressed relative to the tumour volume at the time the PGP prodrug is administered or alternatively the median tumour volumes may be calculated. The anti-tumour activity is compared to control groups receiving either transfected or non-transfected tumour cells and buffer without PGP prodrug.
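The volume formula and the relative volume measure can be illustrated with a short sketch. The formula is implemented exactly as stated in the text; the diameters, the measurement series and the function names are hypothetical and are included only to show the calculation, including the time taken to reach a given multiple of the initial volume.

```python
import math


def tumour_volume(D, d):
    """Volume = pi/6 x D^2 x d, with D the larger and d the smaller diameter."""
    return math.pi / 6 * D ** 2 * d


def relative_volume(volume, volume_at_dosing):
    """Tumour volume relative to the volume when the prodrug was administered."""
    return volume / volume_at_dosing


def days_to_fold(measurements, fold=4.0):
    """Days from the first measurement until volume first reaches fold x initial.

    measurements is a list of (day, volume) pairs in time order.
    Returns None if the threshold is never reached.
    """
    day0, v0 = measurements[0]
    for day, volume in measurements:
        if volume >= fold * v0:
            return day - day0
    return None


# Hypothetical tumour measuring 6 mm x 5 mm (volume in cubic mm).
v = tumour_volume(6.0, 5.0)

# Hypothetical growth series: (day after dosing, volume in cubic mm).
series = [(0, 100.0), (3, 180.0), (7, 390.0), (10, 450.0)]
print(round(v, 1), relative_volume(180.0, 100.0), days_to_fold(series))
```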

Administration of PGP to LoVo tumours established from recombinant LoVo cells or recombinant LoVo/parental LoVo cell mixes results in a significant anti-tumour effect as shown by the PGP treated tumours decreasing in size compared with controls and it taking a significantly longer time for the PGP treated tumours to reach 4 times their initial tumour volume compared with controls (FIG. 5). Administration of PGP to LoVo tumours established from non-transfected cells resulted in no significant anti-tumour activity.

Similar studies can be used to demonstrate that the antibody-enzyme gene delivered in an appropriate gene delivery vector to established LoVo tumours produced from non-transfected parental LoVo cells when used in combination with the PGP prodrug can result in significant anti-tumour activity. Thus non-transfected LoVo cells are injected into athymic nude mice (1×10⁷ tumour cells per mouse) and once the tumours are 5-7 mm in diameter the vector containing the antibody-enzyme fusion protein gene is injected intra-tumourally. After 1-3 days to allow the antibody-enzyme fusion protein to be expressed by, and bind to, the LoVo tumour cells, the PGP prodrug is administered as described above. This results in significant anti-tumour activity compared with controls.

EXAMPLE 13

Construction of an (806.077 Fab-CPG2)₂ Fusion Protein

The construction of a (806.077 Fab-CPG2)₂ enzyme fusion was planned with the aim of obtaining a bivalent human carcinoembryonic antigen (CEA) binding molecule which also exhibits CPG2 enzyme activity. To this end the initial construct was designed to contain an 806.077 antibody heavy chain Fd fragment linked at its C-terminus via a flexible (G₄S)₃ peptide linker to the N-terminus of the CPG2 polypeptide (as shown in FIG. 1 but substituting 806.077 in place of A5B7).

The antibody 806.077 (described in International Patent Application WO 97/42329, Zeneca Limited) binds with a very high degree of specificity to human CEA. Thus the 806.077 antibody is particularly suitable for targeting colorectal carcinoma or other CEA antigen bearing cells.

In general, antibody (or antibody fragment)-enzyme conjugate or fusion proteins should be at least divalent, that is to say capable of binding at least 2 tumour associated antigens (which may be the same or different). In the case of the (806.077 Fab-CPG2)₂ fusion protein, dimerisation of the enzyme component takes place (after expression, as with the native enzyme) thus forming an enzymatic molecule which contains two Fab antibody fragments (and is thus bivalent with respect to antibody binding sites) and two molecules of CPG2 (FIG. 2a).

a) Cloning of the 806.077 Antibody Genes

Methods for the cloning and characterisation of recombinant murine 806.077 F(ab′)₂ antibody have been published (International Patent Application WO 97/42329, Example 7). Reference Example 7.5 describes cloning of the 806.077 antibody variable region genes into Bluescript™ KS+ vectors. These vectors were subsequently used as the source of the 806.077 variable region genes for the construction of 806.077 chimaeric light and heavy chain Fd genes.

b) Chimaeric 806.077 Antibody Vector Constructs

International Patent Application WO 97/42329, Example 8 describes the cloning of the 806.077 chimaeric light and heavy chain Fd genes in the vectors pNG3-Vkss-HuCk-NEO (NCIMB deposit no. 40799) and pNG4-VHss-HuIgG2CH1′ (NCIMB deposit no. 40797) respectively. The resulting vectors were designated pNG4/VHss806.077VH-IgG2CH1′ (806.077 chimaeric heavy chain Fd′) and pNG3/VKss806.077VK-HuCK-NEO (806.077 chimaeric light chain). These vectors were the source of the 806.077 antibody genes for the construction of the 806.077 Fab-CPG2 fusion protein.

c) Construction of the 806.077 Heavy Chain Fd-CPG2 Fusion Protein Gene

The cloning and construction of the CPG2 gene used are described in Example 1, sections c and d. Similarly, the construction of the pNG4/A5B7VH-IgG2CH1/CPG2 R6 vector, which was used for the construction of the 806.077 heavy chain Fd-CPG2, is described in Example 1, section e. The 806.077 variable heavy chain gene was removed from the pNG4/VHss806.077VH-IgG2CH1′ vector by digestion with restriction enzymes HindIII and NheI and a band of the expected size (approximately 300 b.p.) which contained the variable region gene was purified. The same restriction enzymes (HindIII/NheI) were used to digest the vector pNG4/A5B7VH-IgG2CH1/CPG2 R6 in preparation for the substitution of the 806.077 variable region for that of the A5B7 antibody. After digestion, the DNA was dephosphorylated then the larger vector band was separated and purified. The similarly restricted variable region gene fragment was then ligated in to this prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and analysed by restriction digest analysis and subsequently sequenced to confirm the fusion gene sequence. A number of the clones were found to be correct and one of these clones, pNG4/VHss806VH-IgG2CH1/CPG2 R6, was chosen for further work. The sequence of the 806.077 heavy chain Fd-CPG2 fusion protein gene created is shown in SEQ ID NOS 25 and 26.

d) Co-transfection, Transient Expression and Analysis of Fusion Protein

The plasmids pNG4/VHss806.077VH-IgG2CH1/CPG2 R6 (encoding the antibody chimaeric Fd-CPG2 fusion protein) and pNG3/VKss806.077VK-HuCK-NEO (encoding the antibody chimaeric light chain) were co-transfected into COS-7 cells using a LIPOFECTIN™ based procedure described in Example 1f above. Analysis of the fusion protein was performed as described in Example 1g. The HPLC based enzyme activity assay clearly showed CPG2 enzyme activity to be present in the cell supernatant and both the anti-CEA ELISA assays exhibited binding of protein at levels commensurate with a bivalent 806.077 antibody molecule. The fact that the anti-CEA ELISA detected with an anti-CPG2 reporter antibody also exhibited clear CEA binding indicated that not only antibody but also antibody-CPG2 fusion protein was binding CEA. Western blot analysis with both reporter antibody assays clearly displayed a (806.077 Fab-CPG2)₂ fusion protein subunit of the expected approximately 90 kDa size with only a small amount of degradation or smaller products (such as Fab or enzyme) observable. Since CPG2 is only known to exhibit enzyme activity when it is in a dimeric state and since only antibody enzyme fusion protein is present, this indicates that the 90 kDa fusion protein (seen under SDS/PAGE conditions) dimerises via the natural CPG2 dimerisation mechanism to form a 180 kDa dimeric antibody-enzyme fusion protein molecule (FIG. 2a) in “native” buffer conditions. Furthermore, this molecule exhibits both CPG2 enzymatic activity and CEA antigen binding properties which do not appear to be significantly different in the fusion protein compared with enzyme or antibody alone.

e) Construction of a (806.077 Fab-CPG2)₂ Fusion Protein Coexpression Vector for Use in Transient and Stable Cell Line Expression

For a simpler transfection methodology and the direct coupling of both expression cassettes to a single selection marker, a co-expression vector for fusion protein expression was constructed using the existing vectors pNG4/VHss806.077VH-IgG2CH1/CPG2 (encoding the antibody Fd-CPG2 fusion protein) and pNG3/VKss806.077VK-HuCK-NEO (encoding the antibody light chain). The pNG4/VHss806.077VH-IgG2CH1/CPG2 plasmid was first digested with the restriction enzyme ScaI, the linear vector band purified, digested with the restriction enzymes BglII and BamHI and a desired band (approximately 2700 b.p.) purified. The plasmid pNG3/VKss806.077VK-HuCK-NEO was digested with the restriction enzyme BamHI after which the DNA was dephosphorylated and the vector band purified. The heavy chain expression cassette fragment was ligated in to the prepared vector and the ligation mix transformed into E. coli. The orientation was checked by a variety of restriction digests and clones selected which had the heavy chain cassette in the same direction as that of the light chain. This plasmid was termed pNG3-806.077-CPG2/R6-coexp.-NEO.

EXAMPLE 14

Construction of a (55.1 scFv-CPG2)₂ Fusion Protein

The 55.1 antibody, described in U.S. Pat. No. 5,665,357, recognises the CA55.1 tumour associated antigen which is expressed on the majority of colorectal tumours and is only weakly expressed or absent in normal colonic tissue. The determination of the 55.1 heavy and light chain cDNA sequences is described in Example 3 of the aforementioned U.S. patent. A plasmid expression vector allowing the secretion of antibody fragments into the periplasm of E. coli utilizing a single pelB leader sequence (pICI266) has been deposited as accession number NCIMB 40589 on Oct. 11, 1993 under the Budapest Treaty at the National Collections of Industrial and Marine Bacteria Limited (NCIMB), 23 St. Machar Drive, Aberdeen, AB2 1RY, Scotland, U.K. This vector was modified as described in Example 3.3a of U.S. Pat. No. 5,665,357 to create pICI1646; this plasmid was used for cloning of various 55.1 antibody fragments as described in further subsections of Example 3, including the production of a 55.1 scFv construct which was designated pICI1657.

The pICI1657 plasmid (otherwise known as pICI-55.1 scFv) was used as the starting point for the construction of the (55.1 scFv-CPG2)₂ fusion protein. The 55.1 scFv gene was amplified using the oligonucleotides CME 3270 and CME 3272 (SEQ ID NOS: 27 and 28 respectively) and the plasmid pICI1657 as the template DNA. The resulting PCR product band of about 790 b.p. was purified. Similarly the pNG4/A5B7VH-IgG2CH1/CPG2 R6 plasmid described in Example 1e above was used as the template DNA in a standard PCR reaction to amplify the CPG2 gene using the oligonucleotide primers CME 3274 and CME 3275 (SEQ ID NOS: 29 and 30 respectively). The expected PCR product band of about 1200 b.p. was purified.

A further PCR reaction was performed to join (or splice) the two purified PCR reaction products together. Standard PCR reaction conditions were used with varying amounts (between 0.5 and 2 μl) of each PCR product but utilising 25 cycles (instead of the usual 15 cycles) with the oligonucleotides CME 3270 and CME 3275 (SEQ ID NOS: 27 & 30). A reaction product of the expected size (approximately 2000 b.p.) was excised, purified and eluted in 20 μl H₂O, digested using the restriction enzyme EcoRI and purified. The vector pNG4/VHss806.077VH-IgG2CH1/CPG2 was prepared to receive the above PCR product by digestion with restriction enzyme EcoRI, dephosphorylated, the larger vector band separated from the smaller fragment and purified. The similarly restricted PCR product was ligated in to the prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and analysed by HindIII/NotI restriction digestion to check for correct fragment orientation and appropriate clones subsequently sequenced to confirm the fusion gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation pNG4/55.1scFv/CPG2 R6. The DNA and amino acid sequences of the fusion protein are shown in SEQ ID NOS: 31 and 32.
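As a rough plausibility check on the expected size of the spliced product, the length of a product joined by overlapping ends is the sum of the two input lengths less the shared overlap. The sketch below is illustrative only. The overlap length is an assumption: 45 b.p. is used here, being the coding length of a 15 residue (G₄S)₃ linker, and the primer overlap actually used is not stated in the text. The function name is hypothetical.

```python
def spliced_length(first_bp, second_bp, overlap_bp):
    """Length of a product made by joining two fragments that share an overlap."""
    if overlap_bp < 0 or overlap_bp > min(first_bp, second_bp):
        raise ValueError("overlap must be between 0 and the shorter fragment length")
    return first_bp + second_bp - overlap_bp


# Approximate band sizes from the text: scFv about 790 b.p., CPG2 about 1200 b.p.
# ASSUMPTION: 45 b.p. overlap (15 codons x 3 bases), not given in the source.
ASSUMED_OVERLAP_BP = 15 * 3
product = spliced_length(790, 1200, ASSUMED_OVERLAP_BP)
print(product)
```

On these approximate figures the estimate is of the same order as the approximately 2000 b.p. band reported, which is as much agreement as band sizes read from a gel can be expected to give.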

EXAMPLE 15

Modification of the Plasmid pNG4/55.1scFv/CPG2 R6 to Facilitate scFv Gene Exchange

During the construction of pNG4/55.1scFv/CPG2 R6 a unique BspEI (isoschizomer of AccIII) site was introduced into the flexible (G₄S)₃ linker coding sequence, situated between the antibody and CPG2 genes. To facilitate cloning of alternative scFv constructs the EcoRI site 3′ of the CPG2 gene in the pNG4/55.1scFv/CPG2 R6 was deleted in order to enable insertion of alternative scFv antibody genes in frame, both behind the plasmid signal sequence and 5′ of the CPG2 gene, via an EcoRI/BspEI fragment cloning. This modification was achieved by PCR mutagenesis in which first the pNG4/55.1scFv/CPG2 R6 was amplified using oligonucleotides CME 3903 and CME 3906 (SEQ ID NOS: 33 and 34 respectively). Secondly, the pNG4/55.1scFv/CPG2 R6 was again amplified but using oligonucleotides CME 4040 and CME 3905 (SEQ ID NOS: 35 and 36 respectively). The first expected PCR product band of about 420 b.p. was purified. The second PCR reaction was similarly treated and the expected PCR product band of about 450 b.p. purified.

A further PCR reaction was performed to join (or splice) the two purified PCR reaction products together. Standard PCR reaction conditions were used with varying amounts (between 0.5 and 2 μl) of each PCR product but utilising between 15 and 25 cycles with oligonucleotides CME 3905 and CME 3906 (SEQ ID NOS: 36 & 34). A reaction product of the expected size (approximately 840 b.p.) was purified, digested using the restriction enzymes NotI and XbaI and the expected fragment band of ca. 460 b.p. was purified. The original pNG4/55.1scFv/CPG2 R6 was prepared to receive the above PCR product by digestion with restriction enzymes NotI and XbaI, dephosphorylated and the larger vector band separated from the smaller fragment. The vector band was purified and subsequently the similarly restricted PCR product was ligated in to the prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and analysed by EcoRI restriction digestion to check for insertion of the modified fragment and appropriate clones subsequently sequenced to confirm the sequence change. A number of clones with the correct sequence were obtained and one of these clones was given the plasmid designation pNG4/55.1scFv/CPG2 R6/del EcoRI. This mutation removes the EcoRI site which was 3′ of the CPG2 gene and simultaneously introduces an additional stop codon. The DNA sequence of the fusion protein gene up to, and including, the two stop codons is shown in SEQ ID NO: 37.

EXAMPLE 16

Construction of an 806.077 scFv Antibody Gene

The 806.077 scFv was created using vectors pNG4/VHss806.077VH-IgG2CH1′and pNG3/VKss806.077VK-HuCK-NEO which are sources for 806.077 VH and VKvariable region genes. The 806.077 VH gene was amplified from thepNG4/VHss806.077VH-IgG2CH1′ plasmid using standard PCR conditions withthe oligonucleotides CME 3260 and CME 3266 (SEQ ID NOS: 39 and 40respectively). The 806.077 VK was amplified from thepNG3/VKss806.077VK-HuCK-NEO plasmid using oligonucleotides CME 3262 andCME 3267 (SEQ ID NOS: 41 and 42 respectively). The VH and VK PCRreaction products were purified.

A further PCR reaction was performed to join (or splice) the twopurified PCR reaction products together. Standard PCR reactionconditions were used using varying amounts (between 0.5 to 2 μl) of eachPCR product but utilising between 15 and 25 cycles with the flankingoligonucleotides oligonucleotides CME 3260 and CME 3262 (SEQ ID NOS: 39& 41). A reaction product of the expected size (approximately 730 b.p.)was purified, digested using the restriction enzymes NcoI and XhoI andan expected fragment band of about 720 b.p. purified.

The pICI1657 plasmid (otherwise known as pICI-55.1 scFv) had been further modified by the insertion of a double stranded DNA cassette produced from the two oligonucleotides CME 3143 and CME 3145 (SEQ ID NOS: 45 and 46) between the existing XhoI and EcoR restriction sites by standard cloning techniques to create the vector pICI266-55.1 scFvtag/his (the DNA sequence of the resulting 55.1 scFv tag/his gene is shown in SEQ ID NO: 47). This vector was prepared to receive the above PCR product by digestion with restriction enzymes NcoI and XhoI, dephosphorylated and the larger vector band separated from the smaller fragment. The vector band was purified and subsequently the similarly restricted PCR product was ligated into the prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and analysed by EcoRI restriction digestion to check for insertion of the modified fragment, and appropriate clones were subsequently sequenced to confirm the sequence change. A number of clones with the correct sequence were obtained and one of these clones was given the plasmid designation pICI266/806IscFvtag/his (alternatively known as pICI266-806VH/VLscFvtag/his). The DNA and protein sequences of the 806IscFvtag/his gene are shown in SEQ ID NOS: 25 and 26.

EXAMPLE 17

Construction of an (806.077 scFv-CPG2)₂ Fusion Protein

The pICI266/806IscFvtag/his plasmid was used as the source for the 806scFv. The gene was amplified using oligonucleotides CME 3907 and CME 3908 (SEQ ID NOS: 48 and 49) and a band of the expected size purified. This fragment was then digested using the restriction enzymes EcoRI and BspEI after which an expected fragment band of about 760 b.p. was purified.

The pNG4/55.1scFv/CPG2 R6/del EcoRI plasmid was prepared to receive the above fragment by digestion with restriction enzymes EcoRI and BspEI, dephosphorylated and the larger vector band separated from the smaller fragment. The vector band was purified and subsequently the similarly restricted fragment was ligated into the prepared vector and the ligation mix was transformed into E. coli. DNA was prepared from the clones obtained and analysed by EcoRI restriction digestion to check for insertion of the modified fragment. Appropriate clones were subsequently sequenced to confirm the gene sequence. A number of clones with the correct sequence were obtained and one of these clones was given the plasmid designation pNG4/806IscFv/CPG2 R6/del EcoRI. The DNA and protein sequences of the fusion protein gene 806IscFv/CPG2 R6 are shown in SEQ ID NOS: 50 and 51.

EXAMPLE 18

Co-transfection, Transient Expression of Antibody-CPG2 Fusion Proteins

As described in Example 1f, plasmids encoding other fusion protein variants can be transfected using the given standard conditions in order to obtain transient expression of their encoded fusion protein from COS7 cells. In the case of (Fab-CPG2)₂ fusion proteins either co-transfection of appropriate plasmids or transfection of co-expression proteins can be performed. Similarly, the single expression plasmids of (scFv-CPG2)₂ fusion proteins can also be transfected by the same protocol. In each case a maximum total of 4 mg DNA is used in an individual transfection.

EXAMPLE 19

Gene Switches for Protein Expression

As described in Example 1j, tightly controlled but inducible gene switch systems such as the “TET on” or “TET off” (Gossen, M. et al (1995) Science 268: 1766-1769) or the ecdysone/muristerone A (No, D. et al (1996) PNAS 93: 3346-3351) systems may be used for the expression of fusion proteins. Appropriate methodology and cloning strategies as described in Example 5 may be used for antibody Fab-enzyme fusions requiring an IRES sequence for expression. Insertion of the appropriate gene cassette into the switchable expression vectors may be used if the fusion protein product is a single polypeptide chain such as in scFv-enzyme constructs.

EXAMPLE 20

Determination of the Properties of COS7 Cell Secreted Antibody-enzyme Fusion Proteins

The COS7 cell supernatant material can be analysed for the presence of antibody fusion proteins as described in Example 1g. Similarly the use of expressed fusion protein and CPG2 prodrug in an in vitro cytotoxicity assay can be performed as previously described in Example 1h. The HPLC based enzyme activity assay can show CPG2 enzyme activity to be present in the cell supernatant, and binding in the anti-CEA ELISA can be detected with an anti-CPG2 reporter antibody to confirm binding of protein at levels commensurate with a bivalent A5B7 antibody molecule and also to demonstrate that antibody-CPG2 fusion protein (not just the antibody component) is binding CEA.

Western blot analysis with both reporter antibody assays clearly displays a fusion protein subunit of the expected size. Since CPG2 is only known to exhibit enzyme activity when it is in a dimeric state, and since only antibody-enzyme fusion protein is present, this indicates that the fusion protein (seen under SDS/PAGE conditions) dimerises via the natural CPG2 dimerisation mechanism to form a dimeric antibody-enzyme fusion protein molecule in “native” buffer conditions. Furthermore, this molecule exhibits both CPG2 enzymatic activity and CEA antigen binding properties which do not appear to be significantly different in the fusion protein compared with enzyme or antibody alone. Results obtained from the cytotoxicity assay can demonstrate that antibody-enzyme fusion protein (together with prodrug) causes at least equivalent cell kill, and results in lower numbers of cells at the end of the assay period, than the equivalent levels of A5B7 F(ab′)₂-CPG2 conjugate (with the same prodrug). Cell killing (above basal control levels) can only occur if the prodrug is converted to active drug by the CPG2 enzyme and, since the cells are washed to remove unbound protein, only cell bound enzyme will remain at the stage where the prodrug is added. Thus this experiment can demonstrate that at least as much of the (A5B7-CPG2 R6)₂ fusion protein remains bound compared with conventional A5B7 F(ab′)₂-CPG2 conjugate, as a greater degree of cell killing (presumably due to higher prodrug to drug conversion) occurs.

EXAMPLE 21

In vitro and in vivo Determination of the Properties of Antibody-enzyme Fusion Proteins Expressed from Recombinant Tumour Cells

The construction of fusion protein expressing tumour cell lines can be performed as described in Example 4.

Retention of the fusion protein on the cell surface of recombinant LoVo tumour cells expressing antibody-enzyme fusion protein can be shown using the techniques described in Example 7. Selective killing of cultured LoVo tumour cells transfected with an antibody-CPG2 fusion protein gene by a prodrug that is converted by the enzyme into an active drug can be demonstrated as described in Example 8. Establishment of antibody-enzyme fusion protein expressing LoVo tumour xenografts in athymic mice can be performed as described in Example 9. Enzyme activity in tumour xenograft samples can also be determined as described in Example 10.

Determination of enzyme activity in plasma samples can be performed as described in Example 11. The anti-tumour activity of PGP prodrug in LoVo tumours expressing the antibody-CPG2 fusion protein can be evaluated using the method described in Example 12.

The results from these experiments can be used to show that the antibody-CPG2 fusion protein secreted from CEA positive tumour cell lines binds to the surface of the cells (via CEA) whereas the same protein expressed from CEA negative tumours shows no such binding. These results can demonstrate that the transfected cells which express the antibody-CPG2 fusion protein can convert the PGP prodrug into the more potent active drug while non-transfected LoVo cells are unable to convert the prodrug. Consequently the transfected LoVo cells will be over 100 fold more sensitive to the PGP prodrug in terms of cell killing compared to the non-transfected LoVo cells, thus demonstrating that transfecting tumour cells with a gene for an antibody-enzyme fusion protein can lead to selective tumour cell killing with a prodrug.

Administration of PGP to LoVo tumours established from recombinant LoVo cells or recombinant LoVo/parental LoVo cell mixes can result in a significant anti-tumour effect, as judged by the PGP treated tumours decreasing in size compared to the formulation buffer only treated tumours, and by it taking a significantly longer time for the PGP treated tumours to reach 4 times their initial tumour volume compared with formulation buffer treated tumours. In contrast, administration of PGP to LoVo tumours established from non-transfected cells would result in no significant anti-tumour activity.
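The second readout, the time taken for a tumour to reach 4 times its initial volume, can be estimated from serial volume measurements by linear interpolation between the two measurements that bracket the target. The sketch below is illustrative only: the function name, the interpolation method and all numbers are assumptions made for the example and are not data or methods from the experiments referred to above.

```python
def time_to_fold(days, volumes, fold=4.0):
    """Interpolated day on which volume first reaches fold x the initial volume.

    Returns None if the target is not reached within the measurements.
    """
    target = volumes[0] * fold
    points = list(zip(days, volumes))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 < target <= v1:
            return t0 + (target - v0) * (t1 - t0) / (v1 - v0)
    return None


# Hypothetical tumour volumes (mm^3) measured every 4 days.
days = [0, 4, 8, 12, 16, 20]
buffer_treated = [100, 200, 400, 800, 1600, 3200]
pgp_treated = [100, 90, 150, 250, 350, 450]

t_control = time_to_fold(days, buffer_treated)
t_treated = time_to_fold(days, pgp_treated)
print(t_control, t_treated, t_treated - t_control)
```

The difference between the two times is one common way of expressing a growth delay; whether it is significant would still need to be judged across groups of animals with an appropriate statistical test.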

Similar studies can be used to demonstrate that the antibody-enzyme gene delivered in an appropriate gene delivery vector to established LoVo tumours produced from non-transfected parental LoVo cells, when used in combination with the PGP prodrug, can result in significant anti-tumour activity. Thus non-transfected LoVo cells are injected into athymic nude mice (1×10⁷ tumour cells per mouse) and once the tumours are 5-7 mm in diameter the vector containing the antibody-enzyme fusion protein gene is injected intra-tumourally. After 1-7 days to allow the antibody-enzyme fusion protein to be expressed by, and bind to, the LoVo tumour cells, the PGP prodrug is administered as previously described. This results in significant anti-tumour activity compared with control mice receiving formulation buffer instead of PGP prodrug.

EXAMPLE 22

Preparation of (Murine A5B7 Fab-CPG2)₂ Fusion Protein

(Murine A5B7 Fab-CPG2)₂ is expressed from COS-7 and CHO cells essentially as described in part (d) of Example 48 of International Patent Application WO 97/42329 (Zeneca Limited, published Nov. 13, 1997) by cloning the genes for A5B7 light chain and A5B7 Fd linked at its C-terminus via a flexible (G₄S)₃ peptide linker to CPG2 in the pEE14 co-expression vector.

The murine A5B7 light chain is isolated from pAF8 (described in part g of Reference Example 5 in International Patent Application WO 96/20011, Zeneca Limited). Plasmid pAF8 is cut with EcoRI and the resulting 732 bp fragment isolated by electrophoresis on a 1% agarose gel. This fragment is cloned into pEE14 (described by Bebbington in METHODS: A Companion to Methods in Enzymology (1991) 2, 136-145) similarly cut with EcoRI and the resulting plasmid used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 μg/ml). A clone containing a plasmid with the correct sequence and orientation is confirmed by DNA sequence analysis (SEQ ID NO: 57) and the plasmid named pEE14/A5B7muVkmuCK. The amino acid sequence of the encoded signal sequence (amino acid residues 1 to 22) and murine light chain (amino acid residues 23 to 235) is shown in SEQ ID NO: 58.
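Residue ranges such as "1 to 22" and "23 to 235" are 1-based and inclusive, which is easy to get wrong when slicing a sequence string programmatically. A minimal sketch of the conversion follows; the helper name is hypothetical and the sequence is a placeholder of the stated length, not SEQ ID NO: 58.

```python
def residue_range(seq, first, last):
    """Return residues first..last of seq (1-based, inclusive)."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("range outside sequence")
    return seq[first - 1:last]


# Placeholder 235-residue precursor; 'S' and 'L' are stand-ins, not real residues.
precursor = "S" * 22 + "L" * 213

signal = residue_range(precursor, 1, 22)
mature = residue_range(precursor, 23, 235)
print(len(signal), len(mature))
```

The same helper applies to any precursor where a signal sequence and a mature chain are given as inclusive residue ranges.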

The murine Fd-CPG2 gene is prepared from the R6 variant of the CPG2 gene (d of Example 1) and the murine A5B7 Fd sequence in pAF1 (described in part d of Reference Example 5 in International Patent Application WO 96/20011, Zeneca Limited). A PCR reaction with oligonucleotides SEQ ID NOS: 53 and 54 on pAF1 gives a 247 bp fragment. This is cut with HindIII and BamHI and cloned into similarly cut pUC19. The resulting plasmid is used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 μg/ml). A clone containing a plasmid with the correct sequence is named pUC19/muCH1/NcoI-AccIII(Fd). A second PCR with oligonucleotides SEQ ID NOS: 55 and 56 on pNG/VKss/CPG2/R6-neo (Example 1) gives a 265 bp fragment which is cut with HindIII and EcoRI and cloned into similarly cut pUC19 as above to give plasmid pUC19/muCH1-linker-CPG2/AccIII-SacII. Plasmid pUC19/muCH1/NcoI-AccIII(Fd) is cut with HindIII and AccIII and the 258 bp fragment isolated by electrophoresis on a 1% agarose gel. This fragment is cloned into HindIII and AccIII cut pUC19/muCH1-linker-CPG2/AccIII-SacII to give plasmid pUC19/muCH1-linker-CPG2/NcoI-SacII. A 956 bp fragment is isolated from pNG/VKss/CPG2/R6-neo by cutting it with SacII and EcoRI. This is cloned into SacII and EcoRI cut pUC19/muCH1-linker-CPG2/NcoI-SacII to give plasmid pUC19/muCH1-linker-RC/CPG2(R6). The complete gene construct is prepared by isolating a 498 bp HindIII to NcoI fragment from pAF1 and cloning it into HindIII and NcoI cut pUC19/muCH1-linker-RC/CPG2(R6). The resulting plasmid is used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 μg/ml). A clone containing a plasmid with the correct sequence and orientation is confirmed by DNA sequence analysis (SEQ ID NO: 59) and the plasmid named pUC19/muA5B7-RC/CPG2(R6). The amino acid sequence of the encoded signal sequence (amino acid residues 1 to 19) and murine Fd-linker-CPG2 (amino acid residues 20 to 647) is shown in SEQ ID NO: 60. Alternatively, the CPG2 gene sequence described in Example 1 can be obtained by total gene synthesis and converted to the R6 variant as described in d of Example 1. In this case, the base residue C at position 933 in SEQ ID NO: 59 is changed to G. The amino acid sequence of SEQ ID NO: 60 remains unaltered.
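A base change that leaves the encoded amino acid sequence unaltered is a silent (synonymous) substitution. Since 933 = 3 × 311, position 933 would be the third base of codon 311 if the reading frame starts at base 1 of the sequence, which is an assumption here. The sketch below shows how such a change can be checked against the standard genetic code; the function name and the toy open reading frame are hypothetical and are not taken from SEQ ID NO: 59.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def is_silent(orf, position, new_base):
    """True if putting new_base at 1-based position keeps the encoded residue.

    Assumes the reading frame starts at base 1 of orf.
    """
    orf = orf.upper()
    start = 3 * ((position - 1) // 3)
    codon = orf[start:start + 3]
    offset = (position - 1) % 3
    mutated = codon[:offset] + new_base.upper() + codon[offset + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


toy_orf = "ATGGGCAAG"  # hypothetical: Met-Gly-Lys

print(is_silent(toy_orf, 6, "G"))  # GGC -> GGG, Gly in both cases
print(is_silent(toy_orf, 9, "C"))  # AAG -> AAC, Lys becomes Asn
```

Applied to a full-length sequence, the call would be `is_silent(sequence, 933, "G")`, with the frame assumption checked first.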

For expression in the pEE14 vector, the gene is first cloned into pEE6 (this is a derivative of pEE6.hCMV, described by Stephens and Cockett, 1989, Nucleic Acids Research 17, 7110, in which a HindIII site upstream of the hCMV promoter has been converted to a BglII site). Plasmid pUC19/muA5B7-RC/CPG2(R6) is cut with HindIII and EcoRI and the 1974 bp fragment isolated by electrophoresis on a 1% agarose gel. This is cloned into HindIII and EcoRI cut pEE6 in E. coli strain DH5α to give plasmid pEE6/muA5B7-RC/CPG2(R6). The pEE14 co-expression vector is made by first cutting pEE6/muA5B7-RC/CPG2(R6) with BglII and BamHI and isolating the 4320 bp fragment on a 1% agarose gel. This fragment is cloned into BglII and BamHI cut pEE14/A5B7muVkmuCK. The resulting plasmid is used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 μg/ml). A clone containing a plasmid with the correct sequence and orientation is confirmed by DNA sequence analysis and the plasmid named pEE14/muA5B7-RC/CPG2(R6).

For expression of (murine A5B7 Fab-CPG2)₂, plasmid pEE14/muA5B7-RC/CPG2(R6) is used to transfect COS-7 or CHO cells as described in Example 48 of International Patent Application WO 97/42329, Zeneca Limited, published Nov. 13, 1997. COS cell supernatants and CHO clone supernatants are assayed for activity as described in Example 1 and shown to have CEA binding and CPG2 enzyme activity.

EXAMPLE 23

Pharmaceutical Composition

The following illustrates a representative pharmaceutical dosage form containing a gene construct of the invention which may be used for therapy in combination with a suitable prodrug.

A sterile aqueous solution, for injection either parenterally or directly into tumour tissue, containing 10⁷-10¹¹ adenovirus particles comprising a gene construct as described in Example 1. After 3-7 days, three 1 g doses of prodrug are administered as sterile solutions at hourly intervals. Prodrug is selected from N-(4-[N,N-bis(2-iodoethyl)amino]-phenoxycarbonyl)-L-glutamic acid, N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof.

60 21 base pairs nucleic acid single linear other nucleic acid 1GGGAATTCCT CGAGGAGCTC C 21 27 base pairs nucleic acid single linearother nucleic acid 2 CCGGGGAGCT CCTCGAGGAA TTCCCGC 27 18 base pairsnucleic acid single linear other nucleic acid 3 CAGAAGCGCG ACAACGTG 1839 base pairs nucleic acid single linear other nucleic acid 4 CGAGGCCTTGCCGGTGATCT GGACCTGCAC GTAGGCGAT 39 63 base pairs nucleic acid singlelinear other nucleic acid 5 GGGGATGATG TTCGAGACCT GGCCGGCCTT GGCGATGGTCCACTGGAAGC GCAGGTTCTT 60 CGC 63 18 base pairs nucleic acid single linearother nucleic acid 6 CTTGCCGGCG CCCAGATC 18 18 base pairs nucleic acidsingle linear other nucleic acid 7 GTCTCGAACA TCATCCCC 18 18 base pairsnucleic acid single linear other nucleic acid 8 ATCACCGGCA AGGCCTCG 181236 base pairs nucleic acid single linear other nucleic acid 9ATGGATTTTC AAGTGCAGAT TTTCAGCTTC CTGCTAATCA GTGCTTCAGT CATAATGTCC 60CGCGGGCAGA AGCGCGACAA CGTGCTGTTC CAGGCAGCTA CCGACGAGCA GCCGGCCGTG 120ATCAAGACGC TGGAGAAGCT GGTCAACATC GAGACCGGCA CCGGTGACGC CGAGGGCATC 180GCCGCTGCGG GCAACTTCCT CGAGGCCGAG CTCAAGAACC TCGGCTTCAC GGTCACGCGA 240AGCAAGTCGG CCGGCCTGGT GGTGGGCGAC AACATCGTGG GCAAGATCAA GGGCCGCGGC 300GGCAAGAACC TGCTGCTGAT GTCGCACATG GACACCGTCT ACCTCAAGGG CATTCTCGCG 360AAGGCCCCGT TCCGCGTCGA AGGCGACAAG GCCTACGGCC CGGGCATCGC CGACGACAAG 420GGCGGCAACG CGGTCATCCT GCACACGCTC AAGCTGCTGA AGGAATACGG CGTGCGCGAC 480TACGGCACCA TCACCGTGCT GTTCAACACC GACGAGGAAA AGGGTTCCTT CGGCTCGCGC 540GACCTGATCC AGGAAGAAGC CAAGCTGGCC GACTACGTGC TCTCCTTCGA GCCCACCAGC 600GCAGGCGACG AAAAACTCTC GCTGGGCACC TCGGGCATCG CCTACGTGCA GGTCCAGATC 660ACCGGCAAGG CCTCGCATGC CGGCGCCGCG CCCGAGCTGG GCGTGAACGC GCTGGTCGAG 720GCTTCCGACC TCGTGCTGCG CACGATGAAC ATCGACGACA AGGCGAAGAA CCTGCGCTTC 780CAGTGGACCA TCGCCAAGGC CGGCCAGGTC TCGAACATCA TCCCCGCCAG CGCCACGCTG 840AACGCCGACG TGCGCTACGC GCGCAACGAG GACTTCGACG CCGCCATGAA GACGCTGGAA 900GAGCGCGCGC AGCAGAAGAA GCTGCCCGAG GCCGACGTGA AGGTGATCGT CACGCGCGGC 960CGCCCGGCCT TCAATGCCGG CGAAGGCGGC AAGAAGCTGG TCGACAAGGC GGTGGCCTAC 
1020TACAAGGAAG CCGGCGGCAC GCTGGGCGTG GAAGAGCGCA CCGGCGGCGG CACCGACGCG 1080GCCTACGCCG CGCTCTCAGG CAAGCCAGTG ATCGAGAGCC TGGGCCTGCC GGGCTTCGGC 1140TACCACAGCG ACAAGGCCGA GTACGTGGAC ATCAGCGCGA TTCCGCGCCG CCTGTACATG 1200GCTGCGCGCC TGATCATGGA TCTGGGCGCC GGCAAG 1236 412 amino acids amino acidsingle linear protein 10 Met Asp Phe Gln Val Gln Ile Phe Ser Phe Leu LeuIle Ser Ala Ser 1 5 10 15 Val Ile Met Ser Arg Gly Gln Lys Arg Asp AsnVal Leu Phe Gln Ala 20 25 30 Ala Thr Asp Glu Gln Pro Ala Val Ile Lys ThrLeu Glu Lys Leu Val 35 40 45 Asn Ile Glu Thr Gly Thr Gly Asp Ala Glu GlyIle Ala Ala Ala Gly 50 55 60 Asn Phe Leu Glu Ala Glu Leu Lys Asn Leu GlyPhe Thr Val Thr Arg 65 70 75 80 Ser Lys Ser Ala Gly Leu Val Val Gly AspAsn Ile Val Gly Lys Ile 85 90 95 Lys Gly Arg Gly Gly Lys Asn Leu Leu LeuMet Ser His Met Asp Thr 100 105 110 Val Tyr Leu Lys Gly Ile Leu Ala LysAla Pro Phe Arg Val Glu Gly 115 120 125 Asp Lys Ala Tyr Gly Pro Gly IleAla Asp Asp Lys Gly Gly Asn Ala 130 135 140 Val Ile Leu His Thr Leu LysLeu Leu Lys Glu Tyr Gly Val Arg Asp 145 150 155 160 Tyr Gly Thr Ile ThrVal Leu Phe Asn Thr Asp Glu Glu Lys Gly Ser 165 170 175 Phe Gly Ser ArgAsp Leu Ile Gln Glu Glu Ala Lys Leu Ala Asp Tyr 180 185 190 Val Leu SerPhe Glu Pro Thr Ser Ala Gly Asp Glu Lys Leu Ser Leu 195 200 205 Gly ThrSer Gly Ile Ala Tyr Val Gln Val Gln Ile Thr Gly Lys Ala 210 215 220 SerHis Ala Gly Ala Ala Pro Glu Leu Gly Val Asn Ala Leu Val Glu 225 230 235240 Ala Ser Asp Leu Val Leu Arg Thr Met Asn Ile Asp Asp Lys Ala Lys 245250 255 Asn Leu Arg Phe Gln Trp Thr Ile Ala Lys Ala Gly Gln Val Ser Asn260 265 270 Ile Ile Pro Ala Ser Ala Thr Leu Asn Ala Asp Val Arg Tyr AlaArg 275 280 285 Asn Glu Asp Phe Asp Ala Ala Met Lys Thr Leu Glu Glu ArgAla Gln 290 295 300 Gln Lys Lys Leu Pro Glu Ala Asp Val Lys Val Ile ValThr Arg Gly 305 310 315 320 Arg Pro Ala Phe Asn Ala Gly Glu Gly Gly LysLys Leu Val Asp Lys 325 330 335 Ala Val Ala Tyr Tyr Lys Glu Ala Gly GlyThr Leu Gly Val Glu Glu 340 345 350 Arg Thr Gly Gly Gly Thr Asp Ala AlaTyr Ala 
Ala Leu Ser Gly Lys 355 360 365 Pro Val Ile Glu Ser Leu Gly LeuPro Gly Phe Gly Tyr His Ser Asp 370 375 380 Lys Ala Glu Tyr Val Asp IleSer Ala Ile Pro Arg Arg Leu Tyr Met 385 390 395 400 Ala Ala Arg Leu IleMet Asp Leu Gly Ala Gly Lys 405 410 21 base pairs nucleic acid singlelinear other nucleic acid 11 CCACTCTCAC AGTGAGCTCG G 21 55 base pairsnucleic acid single linear other nucleic acid 12 ACCGCTACCG CCACCACCAGAGCCACCACC GCCAACTGTC TTGTCCACCT TGGTG 55 18 base pairs nucleic acidsingle linear other nucleic acid 13 ACCCCCTCTA GAGTCGAC 18 54 base pairsnucleic acid single linear other nucleic acid 14 TCTGGTGGTG GCGGTAGCGGTGGCGGGGGT TCCCAGAAGC GCGACAACGT GCTG 54 1929 base pairs nucleic acidsingle linear other nucleic acid 15 ATGGAGTTGT GGCTGAACTG GATTTTCCTTGTAACACTTT TAAATGGTAT CCAGTGTGAG 60 GTGAAGCTGG TGGAGTCTGG AGGAGGCTTGGTACAGCCTG GGGGTTCTCT GAGACTCTCC 120 TGTGCAACTT CTGGGTTCAC CTTCACTGATTACTACATGA ACTGGGTCCG CCAGCCTCCA 180 GGAAAGGCAC TTGAGTGGTT GGGTTTTATTGGAAACAAAG CTAATGGTTA CACAACAGAG 240 TACAGTGCAT CTGTGAAGGG TCGGTTCACCATCTCCAGAG ATAAATCCCA AAGCATCCTC 300 TATCTTCAAA TGAACACCCT GAGAGCTGAGGACAGTGCCA CTTATTACTG TACAAGAGAT 360 AGGGGGCTAC GGTTCTACTT TGACTACTGGGGCCAAGGCA CCACTCTCAC AGTGAGCTCG 420 GCTAGCACCA AGGGACCATC GGTCTTCCCCCTGGCCCCCT GCTCCAGGAG CACCTCCGAG 480 AGCACAGCCG CCCTGGGCTG CCTGGTCAAGGACTACTTCC CCGAACCGGT GACGGTGTCG 540 TGGAACTCAG GCGCTCTGAC CAGCGGCGTGCACACCTTCC CGGCTGTCCT ACAGTCCTCA 600 GGACTCTACT CCCTCAGCAG CGTCGTGACGGTGCCCTCCA GCAACTTCGG CACCCAGACC 660 TACACCTGCA ACGTAGATCA CAAGCCCAGCAACACCAAGG TGGACAAGAC AGTTGGCGGT 720 GGTGGCTCTG GTGGTGGCGG TAGCGGTGGCGGGGGTTCCC AGAAGCGCGA CAACGTGCTG 780 TTCCAGGCAG CTACCGACGA GCAGCCGGCCGTGATCAAGA CGCTGGAGAA GCTGGTCAAC 840 ATCGAGACCG GCACCGGTGA CGCCGAGGGCATCGCCGCTG CGGGCAACTT CCTCGAGGCC 900 GAGCTCAAGA ACCTCGGCTT CACGGTCACGCGAAGCAAGT CGGCCGGCCT GGTGGTGGGC 960 GACAACATCG TGGGCAAGAT CAAGGGCCGCGGCGGCAAGA ACCTGCTGCT GATGTCGCAC 1020 ATGGACACCG TCTACCTCAA GGGCATTCTCGCGAAGGCCC CGTTCCGCGT CGAAGGCGAC 1080 AAGGCCTACG GCCCGGGCAT 
CGCCGACGACAAGGGCGGCA ACGCGGTCAT CCTGCACACG 1140 CTCAAGCTGC TGAAGGAATA CGGCGTGCGCGACTACGGCA CCATCACCGT GCTGTTCAAC 1200 ACCGACGAGG AAAAGGGTTC CTTCGGCTCGCGCGACCTGA TCCAGGAAGA AGCCAAGCTG 1260 GCCGACTACG TGCTCTCCTT CGAGCCCACCAGCGCAGGCG ACGAAAAACT CTCGCTGGGC 1320 ACCTCGGGCA TCGCCTACGT GCAGGTCCAGATCACCGGCA AGGCCTCGCA TGCCGGCGCC 1380 GCGCCCGAGC TGGGCGTGAA CGCGCTGGTCGAGGCTTCCG ACCTCGTGCT GCGCACGATG 1440 AACATCGACG ACAAGGCGAA GAACCTGCGCTTCCAGTGGA CCATCGCCAA GGCCGGCCAG 1500 GTCTCGAACA TCATCCCCGC CAGCGCCACGCTGAACGCCG ACGTGCGCTA CGCGCGCAAC 1560 GAGGACTTCG ACGCCGCCAT GAAGACGCTGGAAGAGCGCG CGCAGCAGAA GAAGCTGCCC 1620 GAGGCCGACG TGAAGGTGAT CGTCACGCGCGGCCGCCCGG CCTTCAATGC CGGCGAAGGC 1680 GGCAAGAAGC TGGTCGACAA GGCGGTGGCCTACTACAAGG AAGCCGGCGG CACGCTGGGC 1740 GTGGAAGAGC GCACCGGCGG CGGCACCGACGCGGCCTACG CCGCGCTCTC AGGCAAGCCA 1800 GTGATCGAGA GCCTGGGCCT GCCGGGCTTCGGCTACCACA GCGACAAGGC CGAGTACGTG 1860 GACATCAGCG CGATTCCGCG CCGCCTGTACATGGCTGCGC GCCTGATCAT GGATCTGGGC 1920 GCCGGCAAG 1929 643 amino acidsamino acid single linear protein 16 Met Glu Leu Trp Leu Asn Trp Ile PheLeu Val Thr Leu Leu Asn Gly 1 5 10 15 Ile Gln Cys Glu Val Lys Leu ValGlu Ser Gly Gly Gly Leu Val Gln 20 25 30 Pro Gly Gly Ser Leu Arg Leu SerCys Ala Thr Ser Gly Phe Thr Phe 35 40 45 Thr Asp Tyr Tyr Met Asn Trp ValArg Gln Pro Pro Gly Lys Ala Leu 50 55 60 Glu Trp Leu Gly Phe Ile Gly AsnLys Ala Asn Gly Tyr Thr Thr Glu 65 70 75 80 Tyr Ser Ala Ser Val Lys GlyArg Phe Thr Ile Ser Arg Asp Lys Ser 85 90 95 Gln Ser Ile Leu Tyr Leu GlnMet Asn Thr Leu Arg Ala Glu Asp Ser 100 105 110 Ala Thr Tyr Tyr Cys ThrArg Asp Arg Gly Leu Arg Phe Tyr Phe Asp 115 120 125 Tyr Trp Gly Gln GlyThr Thr Leu Thr Val Ser Ser Ala Ser Thr Lys 130 135 140 Gly Pro Ser ValPhe Pro Leu Ala Pro Cys Ser Arg Ser Thr Ser Glu 145 150 155 160 Ser ThrAla Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro 165 170 175 ValThr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr 180 185 190Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val 195 200205 Val Thr Val 
Pro Ser Ser Asn Phe Gly Thr Gln Thr Tyr Thr Cys Asn 210215 220 Val Asp His Lys Pro Ser Asn Thr Lys Val Asp Lys Thr Val Gly Gly225 230 235 240 Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser GlnLys Arg 245 250 255 Asp Asn Val Leu Phe Gln Ala Ala Thr Asp Glu Gln ProAla Val Ile 260 265 270 Lys Thr Leu Glu Lys Leu Val Asn Ile Glu Thr GlyThr Gly Asp Ala 275 280 285 Glu Gly Ile Ala Ala Ala Gly Asn Phe Leu GluAla Glu Leu Lys Asn 290 295 300 Leu Gly Phe Thr Val Thr Arg Ser Lys SerAla Gly Leu Val Val Gly 305 310 315 320 Asp Asn Ile Val Gly Lys Ile LysGly Arg Gly Gly Lys Asn Leu Leu 325 330 335 Leu Met Ser His Met Asp ThrVal Tyr Leu Lys Gly Ile Leu Ala Lys 340 345 350 Ala Pro Phe Arg Val GluGly Asp Lys Ala Tyr Gly Pro Gly Ile Ala 355 360 365 Asp Asp Lys Gly GlyAsn Ala Val Ile Leu His Thr Leu Lys Leu Leu 370 375 380 Lys Glu Tyr GlyVal Arg Asp Tyr Gly Thr Ile Thr Val Leu Phe Asn 385 390 395 400 Thr AspGlu Glu Lys Gly Ser Phe Gly Ser Arg Asp Leu Ile Gln Glu 405 410 415 GluAla Lys Leu Ala Asp Tyr Val Leu Ser Phe Glu Pro Thr Ser Ala 420 425 430Gly Asp Glu Lys Leu Ser Leu Gly Thr Ser Gly Ile Ala Tyr Val Gln 435 440445 Val Gln Ile Thr Gly Lys Ala Ser His Ala Gly Ala Ala Pro Glu Leu 450455 460 Gly Val Asn Ala Leu Val Glu Ala Ser Asp Leu Val Leu Arg Thr Met465 470 475 480 Asn Ile Asp Asp Lys Ala Lys Asn Leu Arg Phe Gln Trp ThrIle Ala 485 490 495 Lys Ala Gly Gln Val Ser Asn Ile Ile Pro Ala Ser AlaThr Leu Asn 500 505 510 Ala Asp Val Arg Tyr Ala Arg Asn Glu Asp Phe AspAla Ala Met Lys 515 520 525 Thr Leu Glu Glu Arg Ala Gln Gln Lys Lys LeuPro Glu Ala Asp Val 530 535 540 Lys Val Ile Val Thr Arg Gly Arg Pro AlaPhe Asn Ala Gly Glu Gly 545 550 555 560 Gly Lys Lys Leu Val Asp Lys AlaVal Ala Tyr Tyr Lys Glu Ala Gly 565 570 575 Gly Thr Leu Gly Val Glu GluArg Thr Gly Gly Gly Thr Asp Ala Ala 580 585 590 Tyr Ala Ala Leu Ser GlyLys Pro Val Ile Glu Ser Leu Gly Leu Pro 595 600 605 Gly Phe Gly Tyr HisSer Asp Lys Ala Glu Tyr Val Asp Ile Ser Ala 610 615 620 Ile Pro Arg ArgLeu Tyr Met Ala Ala Arg Leu 
Ile Met Asp Leu Gly 625 630 635 640 Ala GlyLys 705 base pairs nucleic acid single linear other nucleic acid 17ATGGATTTTC AAGTGCAGAT TTTCAGCTTC CTGCTAATCA GTGCTTCAGT CATAATGTCC 60AGAGGACAAA CTGTTCTCTC CCAGTCTCCA GCAATCCTGT CTGCATCTCC AGGGGAGAAG 120GTCACAATGA CTTGCAGGGC CAGCTCAAGT GTAACTTACA TTCACTGGTA CCAGCAGAAG 180CCAGGTTCCT CCCCCAAATC CTGGATTTAT GCCACATCCA ACCTGGCTTC TGGAGTCCCT 240GCTCGCTTCA GTGGCAGTGG GTCTGGGACC TCTTACTCTC TCACAATCAG CAGAGTGGAG 300GCTGAAGATG CTGCCACTTA TTACTGCCAA CATTGGAGTA GTAAACCACC GACGTTCGGT 360GGAGGCACCA AGCTCGAGAT CAAACGGACT GTGGCTGCAC CATCTGTCTT CATCTTCCCG 420CCATCTGATG AGCAGTTGAA ATCTGGAACT GCCTCTGTTG TGTGCCTGCT GAATAACTTC 480TATCCCAGAG AGGCCAAAGT ACAGTGGAAG GTGGATAACG CCCTCCAATC GGGTAACTCC 540CAGGAGAGTG TCACAGAGCA GGACAGCAAG GACAGCACCT ACAGCCTCAG CAGCACCCTG 600ACGCTGAGCA AAGCAGACTA CGAGAAACAC AAAGTCTACG CCTGCGAAGT CACCCATCAG 660GGCCTGAGTT CGCCCGTCAC AAAGAGCTTC AACAGGGGAG AGTGT 705 235 amino acidsamino acid single linear protein 18 Met Asp Phe Gln Val Gln Ile Phe SerPhe Leu Leu Ile Ser Ala Ser 1 5 10 15 Val Ile Met Ser Arg Gly Gln ThrVal Leu Ser Gln Ser Pro Ala Ile 20 25 30 Leu Ser Ala Ser Pro Gly Glu LysVal Thr Met Thr Cys Arg Ala Ser 35 40 45 Ser Ser Val Thr Tyr Ile His TrpTyr Gln Gln Lys Pro Gly Ser Ser 50 55 60 Pro Lys Ser Trp Ile Tyr Ala ThrSer Asn Leu Ala Ser Gly Val Pro 65 70 75 80 Ala Arg Phe Ser Gly Ser GlySer Gly Thr Ser Tyr Ser Leu Thr Ile 85 90 95 Ser Arg Val Glu Ala Glu AspAla Ala Thr Tyr Tyr Cys Gln His Trp 100 105 110 Ser Ser Lys Pro Pro ThrPhe Gly Gly Gly Thr Lys Leu Glu Ile Lys 115 120 125 Arg Thr Val Ala AlaPro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu 130 135 140 Gln Leu Lys SerGly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe 145 150 155 160 Tyr ProArg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln 165 170 175 SerGly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser 180 185 190Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu 195 200205 Lys His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser 
Ser 210215 220 Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys 225 230 235 39 basepairs nucleic acid single linear other nucleic acid 19 AAGCTTGAATTCGCCGCCAC TATGGATTTT CAAGTGCAG 39 44 base pairs nucleic acid singlelinear other nucleic acid 20 TTAATTGGAT CCGAGCTCCT ATTAACACTC TCCCCTGTTGAAGC 44 50 base pairs nucleic acid single linear other nucleic acid 21AAGCTTCCGG ATCCCTGCAG CCATGGAGTT GTGGCTGAAC TGGATTTTCC 50 38 base pairsnucleic acid single linear other nucleic acid 22 AAGCTTAGTC TAGATTATCACTTGCCGGCG CCCAGATC 38 46 base pairs nucleic acid single linear othernucleic acid 23 CGGGGGATCC AGATCTGAGC TCCTGTAGAC GTCGACATTA ATTCCG 46 30base pairs nucleic acid single linear other nucleic acid 24 GGAAAATCCAGTTCAGCCAC AACTCCATGG 30 1926 base pairs nucleic acid single linearother nucleic acid 25 ATGAAGTTGT GGCTGAACTG GATTTTCCTT GTAACACTTTTAAATGGAAT TCAGTGTGAG 60 GTGCAGCTGC AGCAGTCTGG GGCAGAGCTT GTGAGGTCAGGGGCCTCAGT CAAGTTGTCC 120 TGCACAGCTT CTGGCTTCAA CATTAAAGAC AACTATATGCACTGGGTGAA GCAGAGGCCT 180 GAACAGGGCC TGGAGTGGAT TGCATGGATT GATCCTGAGAATGGTGATAC TGAATATGCC 240 CCGAAGTTCC GGGGCAAGGC CACTTTGACT GCAGACTCATCCTCCAACAC AGCCTACCTG 300 CACCTCAGCA GCCTGACATC TGAGGACACT GCCGTCTATTACTGTCATGT CCTGATCTAT 360 GCTGGTTATT TGGCTATGGA CTACTGGGGT CAAGGAACCTCAGTCGCCGT GAGCTCGGCT 420 AGCACCAAGG GACCATCGGT CTTCCCCCTG GCCCCCTGCTCCAGGAGCAC CTCCGAGAGC 480 ACAGCCGCCC TGGGCTGCCT GGTCAAGGAC TACTTCCCCGAACCGGTGAC GGTGTCGTGG 540 AACTCAGGCG CTCTGACCAG CGGCGTGCAC ACCTTCCCGGCTGTCCTACA GTCCTCAGGA 600 CTCTACTCCC TCAGCAGCGT CGTGACGGTG CCCTCCAGCAACTTCGGCAC CCAGACCTAC 660 ACCTGCAACG TAGATCACAA GCCCAGCAAC ACCAAGGTGGACAAGACAGT TGGCGGTGGT 720 GGCTCTGGTG GTGGCGGTAG CGGTGGCGGG GGTTCCCAGAAGCGCGACAA CGTGCTGTTC 780 CAGGCAGCTA CCGACGAGCA GCCGGCCGTG ATCAAGACGCTGGAGAAGCT GGTCAACATC 840 GAGACCGGCA CCGGTGACGC CGAGGGCATC GCCGCTGCGGGCAACTTCCT CGAGGCCGAG 900 CTCAAGAACC TCGGCTTCAC GGTCACGCGA AGCAAGTCGGCCGGCCTGGT GGTGGGCGAC 960 AACATCGTGG GCAAGATCAA GGGCCGCGGC GGCAAGAACCTGCTGCTGAT GTCGCACATG 1020 GACACCGTCT ACCTCAAGGG CATTCTCGCG 
AAGGCCCCGTTCCGCGTCGA AGGCGACAAG 1080 GCCTACGGCC CGGGCATCGC CGACGACAAG GGCGGCAACGCGGTCATCCT GCACACGCTC 1140 AAGCTGCTGA AGGAATACGG CGTGCGCGAC TACGGCACCATCACCGTGCT GTTCAACACC 1200 GACGAGGAAA AGGGTTCCTT CGGCTCGCGC GACCTGATCCAGGAAGAAGC CAAGCTGGCC 1260 GACTACGTGC TCTCCTTCGA GCCCACCAGC GCAGGCGACGAAAAACTCTC GCTGGGCACC 1320 TCGGGCATCG CCTACGTGCA GGTCCAGATC ACCGGCAAGGCCTCGCATGC CGGCGCCGCG 1380 CCCGAGCTGG GCGTGAACGC GCTGGTCGAG GCTTCCGACCTCGTGCTGCG CACGATGAAC 1440 ATCGACGACA AGGCGAAGAA CCTGCGCTTC CAGTGGACCATCGCCAAGGC CGGCCAGGTC 1500 TCGAACATCA TCCCCGCCAG CGCCACGCTG AACGCCGACGTGCGCTACGC GCGCAACGAG 1560 GACTTCGACG CCGCCATGAA GACGCTGGAA GAGCGCGCGCAGCAGAAGAA GCTGCCCGAG 1620 GCCGACGTGA AGGTGATCGT CACGCGCGGC CGCCCGGCCTTCAATGCCGG CGAAGGCGGC 1680 AAGAAGCTGG TCGACAAGGC GGTGGCCTAC TACAAGGAAGCCGGCGGCAC GCTGGGCGTG 1740 GAAGAGCGCA CCGGCGGCGG CACCGACGCG GCCTACGCCGCGCTCTCAGG CAAGCCAGTG 1800 ATCGAGAGCC TGGGCCTGCC GGGCTTCGGC TACCACAGCGACAAGGCCGA GTACGTGGAC 1860 ATCAGCGCGA TTCCGCGCCG CCTGTACATG GCTGCGCGCCTGATCATGGA TCTGGGCGCC 1920 GGCAAG 1926 642 amino acids amino acid singlelinear protein 26 Met Lys Leu Trp Leu Asn Trp Ile Phe Leu Val Thr LeuLeu Asn Gly 1 5 10 15 Ile Gln Cys Glu Val Gln Leu Gln Gln Ser Gly AlaGlu Leu Val Arg 20 25 30 Ser Gly Ala Ser Val Lys Leu Ser Cys Thr Ala SerGly Phe Asn Ile 35 40 45 Lys Asp Asn Tyr Met His Trp Val Lys Gln Arg ProGlu Gln Gly Leu 50 55 60 Glu Trp Ile Ala Trp Ile Asp Pro Glu Asn Gly AspThr Glu Tyr Ala 65 70 75 80 Pro Lys Phe Arg Gly Lys Ala Thr Leu Thr AlaAsp Ser Ser Ser Asn 85 90 95 Thr Ala Tyr Leu His Leu Ser Ser Leu Thr SerGlu Asp Thr Ala Val 100 105 110 Tyr Tyr Cys His Val Leu Ile Tyr Ala GlyTyr Leu Ala Met Asp Tyr 115 120 125 Trp Gly Gln Gly Thr Ser Val Ala ValSer Ser Ala Ser Thr Lys Gly 130 135 140 Pro Ser Val Phe Pro Leu Ala ProCys Ser Arg Ser Thr Ser Glu Ser 145 150 155 160 Thr Ala Ala Leu Gly CysLeu Val Lys Asp Tyr Phe Pro Glu Pro Val 165 170 175 Thr Val Ser Trp AsnSer Gly Ala Leu Thr Ser Gly Val His Thr Phe 180 185 190 Pro Ala Val LeuGln Ser Ser 
Gly Leu Tyr Ser Leu Ser Ser Val Val 195 200 205 Thr Val ProSer Ser Asn Phe Gly Thr Gln Thr Tyr Thr Cys Asn Val 210 215 220 Asp HisLys Pro Ser Asn Thr Lys Val Asp Lys Thr Val Gly Gly Gly 225 230 235 240Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gln Lys Arg Asp 245 250255 Asn Val Leu Phe Gln Ala Ala Thr Asp Glu Gln Pro Ala Val Ile Lys 260265 270 Thr Leu Glu Lys Leu Val Asn Ile Glu Thr Gly Thr Gly Asp Ala Glu275 280 285 Gly Ile Ala Ala Ala Gly Asn Phe Leu Glu Ala Glu Leu Lys AsnLeu 290 295 300 Gly Phe Thr Val Thr Arg Ser Lys Ser Ala Gly Leu Val ValGly Asp 305 310 315 320 Asn Ile Val Gly Lys Ile Lys Gly Arg Gly Gly LysAsn Leu Leu Leu 325 330 335 Met Ser His Met Asp Thr Val Tyr Leu Lys GlyIle Leu Ala Lys Ala 340 345 350 Pro Phe Arg Val Glu Gly Asp Lys Ala TyrGly Pro Gly Ile Ala Asp 355 360 365 Asp Lys Gly Gly Asn Ala Val Ile LeuHis Thr Leu Lys Leu Leu Lys 370 375 380 Glu Tyr Gly Val Arg Asp Tyr GlyThr Ile Thr Val Leu Phe Asn Thr 385 390 395 400 Asp Glu Glu Lys Gly SerPhe Gly Ser Arg Asp Leu Ile Gln Glu Glu 405 410 415 Ala Lys Leu Ala AspTyr Val Leu Ser Phe Glu Pro Thr Ser Ala Gly 420 425 430 Asp Glu Lys LeuSer Leu Gly Thr Ser Gly Ile Ala Tyr Val Gln Val 435 440 445 Gln Ile ThrGly Lys Ala Ser His Ala Gly Ala Ala Pro Glu Leu Gly 450 455 460 Val AsnAla Leu Val Glu Ala Ser Asp Leu Val Leu Arg Thr Met Asn 465 470 475 480Ile Asp Asp Lys Ala Lys Asn Leu Arg Phe Gln Trp Thr Ile Ala Lys 485 490495 Ala Gly Gln Val Ser Asn Ile Ile Pro Ala Ser Ala Thr Leu Asn Ala 500505 510 Asp Val Arg Tyr Ala Arg Asn Glu Asp Phe Asp Ala Ala Met Lys Thr515 520 525 Leu Glu Glu Arg Ala Gln Gln Lys Lys Leu Pro Glu Ala Asp ValLys 530 535 540 Val Ile Val Thr Arg Gly Arg Pro Ala Phe Asn Ala Gly GluGly Gly 545 550 555 560 Lys Lys Leu Val Asp Lys Ala Val Ala Tyr Tyr LysGlu Ala Gly Gly 565 570 575 Thr Leu Gly Val Glu Glu Arg Thr Gly Gly GlyThr Asp Ala Ala Tyr 580 585 590 Ala Ala Leu Ser Gly Lys Pro Val Ile GluSer Leu Gly Leu Pro Gly 595 600 605 Phe Gly Tyr His Ser Asp Lys Ala GluTyr Val Asp Ile Ser Ala 
Ile 610 615 620 Pro Arg Arg Leu Tyr Met Ala AlaArg Leu Ile Met Asp Leu Gly Ala 625 630 635 640 Gly Lys 39 base pairsnucleic acid single linear other nucleic acid 27 AAGCTTGGAA TTCAGTGTCAGGTCCAACTG CAGCAGCCT 39 54 base pairs nucleic acid single linear othernucleic acid 28 GCTACCGCCA CCTCCGGAGC CACCACCGCC CCGTTTGATC TCGAGCTTGGTGCC 54 58 base pairs nucleic acid single linear other nucleic acid 29TCCGGAGGTG GCGGTAGCGG TGGCGGGGGT TCCCAGAAGC GCGACAACGT GCTGTTCC 58 24base pairs nucleic acid single linear other nucleic acid 30 CCTCGAGGAATTCTTTCACT TGCC 24 2019 base pairs nucleic acid single linear othernucleic acid 31 ATGAAGTTGT GGCTGAACTG GATTTTCCTT GTAACACTTT TAAATGGAATTCAGTGTCAG 60 GTCCAACTGC AGCAGCCTGG GGCTGAACTG GTGAAGCCTG GGGCTTCAGTGCAGCTGTCC 120 TGCAAGGCTT CTGGCTACAC CTTCACCGGC TACTGGATAC ACTGGGTGAAGCAGAGGCCT 180 GGACAAGGCC TTGAGTGGAT TGGAGAGGTT AATCCTAGTA CCGGTCGTTCTGACTACAAT 240 GAGAAGTTCA AGAACAAGGC CACACTGACT GTAGACAAAT CCTCCACCACAGCCTACATG 300 CAACTCAGCA GCCTGACATC TGAGGACTCT GCGGTCTATT ACTGTGCAAGAGAGAGGGCC 360 TATGGTTACG ACGATGCTAT GGACTACTGG GGCCAAGGGA CCACGGTCACCGTCTCCTCA 420 GGTGGCGGTG GCTCGGGCGG TGGTGGGTCG GGTGGCGGCG GATCTGACATTGAGCTCTCA 480 CAGTCTCCAT CCTCCCTGGC TGTGTCAGCA GGAGAGAAGG TCACCATGAGCTGCAAATCC 540 AGTCAGAGTC TCCTCAACAG TAGAACCCGA AAGAACTACT TGGCTTGGTACCAGCAGAGA 600 CCAGGGCAGT CTCCTAAACT GCTGATCTAT TGGGCATCCA CTAGGACATCTGGGGTCCCT 660 GATCGCTTCA CAGGCAGTGG ATCTGGGACA GATTTCACTC TCACCATCAGCAGTGTGCAG 720 GCTGAAGACC TGGCAATTTA TTACTGCAAG CAATCTTATA CTCTTCGGACGTTCGGTGGA 780 GGCACCAAGC TCGAGATCAA ACGGGGCGGT GGTGGCTCCG GAGGTGGCGGTAGCGGTGGC 840 GGGGGTTCCC AGAAGCGCGA CAACGTGCTG TTCCAGGCAG CTACCGACGAGCAGCCGGCC 900 GTGATCAAGA CGCTGGAGAA GCTGGTCAAC ATCGAGACCG GCACCGGTGACGCCGAGGGC 960 ATCGCCGCTG CGGGCAACTT CCTCGAGGCC GAGCTCAAGA ACCTCGGCTTCACGGTCACG 1020 CGAAGCAAGT CGGCCGGCCT GGTGGTGGGC GACAACATCG TGGGCAAGATCAAGGGCCGC 1080 GGCGGCAAGA ACCTGCTGCT GATGTCGCAC ATGGACACCG TCTACCTCAAGGGCATTCTC 1140 GCGAAGGCCC CGTTCCGCGT CGAAGGCGAC AAGGCCTACG GCCCGGGCATCGCCGACGAC 
1200 AAGGGCGGCA ACGCGGTCAT CCTGCACACG CTCAAGCTGC TGAAGGAATACGGCGTGCGC 1260 GACTACGGCA CCATCACCGT GCTGTTCAAC ACCGACGAGG AAAAGGGTTCCTTCGGCTCG 1320 CGCGACCTGA TCCAGGAAGA AGCCAAGCTG GCCGACTACG TGCTCTCCTTCGAGCCCACC 1380 AGCGCAGGCG ACGAAAAACT CTCGCTGGGC ACCTCGGGCA TCGCCTACGTGCAGGTCCAG 1440 ATCACCGGCA AGGCCTCGCA TGCCGGCGCC GCGCCCGAGC TGGGCGTGAACGCGCTGGTC 1500 GAGGCTTCCG ACCTCGTGCT GCGCACGATG AACATCGACG ACAAGGCGAAGAACCTGCGC 1560 TTCCAGTGGA CCATCGCCAA GGCCGGCCAG GTCTCGAACA TCATCCCCGCCAGCGCCACG 1620 CTGAACGCCG ACGTGCGCTA CGCGCGCAAC GAGGACTTCG ACGCCGCCATGAAGACGCTG 1680 GAAGAGCGCG CGCAGCAGAA GAAGCTGCCC GAGGCCGACG TGAAGGTGATCGTCACGCGC 1740 GGCCGCCCGG CCTTCAATGC CGGCGAAGGC GGCAAGAAGC TGGTCGACAAGGCGGTGGCC 1800 TACTACAAGG AAGCCGGCGG CACGCTGGGC GTGGAAGAGC GCACCGGCGGCGGCACCGAC 1860 GCGGCCTACG CCGCGCTCTC AGGCAAGCCA GTGATCGAGA GCCTGGGCCTGCCGGGCTTC 1920 GGCTACCACA GCGACAAGGC CGAGTACGTG GACATCAGCG CGATTCCGCGCCGCCTGTAC 1980 ATGGCTGCGC GCCTGATCAT GGATCTGGGC GCCGGCAAG 2019 673amino acids amino acid single linear protein 32 Met Lys Leu Trp Leu AsnTrp Ile Phe Leu Val Thr Leu Leu Asn Gly 1 5 10 15 Ile Gln Cys Gln ValGln Leu Gln Gln Pro Gly Ala Glu Leu Val Lys 20 25 30 Pro Gly Ala Ser ValGln Leu Ser Cys Lys Ala Ser Gly Tyr Thr Phe 35 40 45 Thr Gly Tyr Trp IleHis Trp Val Lys Gln Arg Pro Gly Gln Gly Leu 50 55 60 Glu Trp Ile Gly GluVal Asn Pro Ser Thr Gly Arg Ser Asp Tyr Asn 65 70 75 80 Glu Lys Phe LysAsn Lys Ala Thr Leu Thr Val Asp Lys Ser Ser Thr 85 90 95 Thr Ala Tyr MetGln Leu Ser Ser Leu Thr Ser Glu Asp Ser Ala Val 100 105 110 Tyr Tyr CysAla Arg Glu Arg Ala Tyr Gly Tyr Asp Asp Ala Met Asp 115 120 125 Tyr TrpGly Gln Gly Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly 130 135 140 SerGly Gly Gly Gly Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Ser 145 150 155160 Gln Ser Pro Ser Ser Leu Ala Val Ser Ala Gly Glu Lys Val Thr Met 165170 175 Ser Cys Lys Ser Ser Gln Ser Leu Leu Asn Ser Arg Thr Arg Lys Asn180 185 190 Tyr Leu Ala Trp Tyr Gln Gln Arg Pro Gly Gln Ser Pro Lys LeuLeu 195 200 205 Ile Tyr Trp Ala Ser 
Thr Arg Thr Ser Gly Val Pro Asp ArgPhe Thr 210 215 220 Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile SerSer Val Gln 225 230 235 240 Ala Glu Asp Leu Ala Ile Tyr Tyr Cys Lys GlnSer Tyr Thr Leu Arg 245 250 255 Thr Phe Gly Gly Gly Thr Lys Leu Glu IleLys Arg Gly Gly Gly Gly 260 265 270 Ser Gly Gly Gly Gly Ser Gly Gly GlyGly Ser Gln Lys Arg Asp Asn 275 280 285 Val Leu Phe Gln Ala Ala Thr AspGlu Gln Pro Ala Val Ile Lys Thr 290 295 300 Leu Glu Lys Leu Val Asn IleGlu Thr Gly Thr Gly Asp Ala Glu Gly 305 310 315 320 Ile Ala Ala Ala GlyAsn Phe Leu Glu Ala Glu Leu Lys Asn Leu Gly 325 330 335 Phe Thr Val ThrArg Ser Lys Ser Ala Gly Leu Val Val Gly Asp Asn 340 345 350 Ile Val GlyLys Ile Lys Gly Arg Gly Gly Lys Asn Leu Leu Leu Met 355 360 365 Ser HisMet Asp Thr Val Tyr Leu Lys Gly Ile Leu Ala Lys Ala Pro 370 375 380 PheArg Val Glu Gly Asp Lys Ala Tyr Gly Pro Gly Ile Ala Asp Asp 385 390 395400 Lys Gly Gly Asn Ala Val Ile Leu His Thr Leu Lys Leu Leu Lys Glu 405410 415 Tyr Gly Val Arg Asp Tyr Gly Thr Ile Thr Val Leu Phe Asn Thr Asp420 425 430 Glu Glu Lys Gly Ser Phe Gly Ser Arg Asp Leu Ile Gln Glu GluAla 435 440 445 Lys Leu Ala Asp Tyr Val Leu Ser Phe Glu Pro Thr Ser AlaGly Asp 450 455 460 Glu Lys Leu Ser Leu Gly Thr Ser Gly Ile Ala Tyr ValGln Val Gln 465 470 475 480 Ile Thr Gly Lys Ala Ser His Ala Gly Ala AlaPro Glu Leu Gly Val 485 490 495 Asn Ala Leu Val Glu Ala Ser Asp Leu ValLeu Arg Thr Met Asn Ile 500 505 510 Asp Asp Lys Ala Lys Asn Leu Arg PheGln Trp Thr Ile Ala Lys Ala 515 520 525 Gly Gln Val Ser Asn Ile Ile ProAla Ser Ala Thr Leu Asn Ala Asp 530 535 540 Val Arg Tyr Ala Arg Asn GluAsp Phe Asp Ala Ala Met Lys Thr Leu 545 550 555 560 Glu Glu Arg Ala GlnGln Lys Lys Leu Pro Glu Ala Asp Val Lys Val 565 570 575 Ile Val Thr ArgGly Arg Pro Ala Phe Asn Ala Gly Glu Gly Gly Lys 580 585 590 Lys Leu ValAsp Lys Ala Val Ala Tyr Tyr Lys Glu Ala Gly Gly Thr 595 600 605 Leu GlyVal Glu Glu Arg Thr Gly Gly Gly Thr Asp Ala Ala Tyr Ala 610 615 620 AlaLeu Ser Gly Lys Pro Val Ile Glu Ser Leu Gly Leu 
Pro Gly Phe 625 630 635640 Gly Tyr His Ser Asp Lys Ala Glu Tyr Val Asp Ile Ser Ala Ile Pro 645650 655 Arg Arg Leu Tyr Met Ala Ala Arg Leu Ile Met Asp Leu Gly Ala Gly660 665 670 Lys 37 base pairs nucleic acid single linear other nucleicacid 33 GGGCGCCGGC AAGTGATAAA ATTCCTCGAG GAGCTCC 37 19 base pairsnucleic acid single linear other nucleic acid 34 CGCCACCTCT GACTTGAGC 1937 base pairs nucleic acid single linear other nucleic acid 35GGAGCTCCTC GAGGAATTTT ATCACTTGCC GGCGCCC 37 19 base pairs nucleic acidsingle linear other nucleic acid 36 GCTGAACGCC GACGTGCGC 19 2025 basepairs nucleic acid single linear other nucleic acid 37 ATGAAGTTGTGGCTGAACTG GATTTTCCTT GTAACACTTT TAAATGGAAT TCAGTGTCAG 60 GTCCAACTGCAGCAGCCTGG GGCTGAACTG GTGAAGCCTG GGGCTTCAGT GCAGCTGTCC 120 TGCAAGGCTTCTGGCTACAC CTTCACCGGC TACTGGATAC ACTGGGTGAA GCAGAGGCCT 180 GGACAAGGCCTTGAGTGGAT TGGAGAGGTT AATCCTAGTA CCGGTCGTTC TGACTACAAT 240 GAGAAGTTCAAGAACAAGGC CACACTGACT GTAGACAAAT CCTCCACCAC AGCCTACATG 300 CAACTCAGCAGCCTGACATC TGAGGACTCT GCGGTCTATT ACTGTGCAAG AGAGAGGGCC 360 TATGGTTACGACGATGCTAT GGACTACTGG GGCCAAGGGA CCACGGTCAC CGTCTCCTCA 420 GGTGGCGGTGGCTCGGGCGG TGGTGGGTCG GGTGGCGGCG GATCTGACAT TGAGCTCTCA 480 CAGTCTCCATCCTCCCTGGC TGTGTCAGCA GGAGAGAAGG TCACCATGAG CTGCAAATCC 540 AGTCAGAGTCTCCTCAACAG TAGAACCCGA AAGAACTACT TGGCTTGGTA CCAGCAGAGA 600 CCAGGGCAGTCTCCTAAACT GCTGATCTAT TGGGCATCCA CTAGGACATC TGGGGTCCCT 660 GATCGCTTCACAGGCAGTGG ATCTGGGACA GATTTCACTC TCACCATCAG CAGTGTGCAG 720 GCTGAAGACCTGGCAATTTA TTACTGCAAG CAATCTTATA CTCTTCGGAC GTTCGGTGGA 780 GGCACCAAGCTCGAGATCAA ACGGGGCGGT GGTGGCTCCG GAGGTGGCGG TAGCGGTGGC 840 GGGGGTTCCCAGAAGCGCGA CAACGTGCTG TTCCAGGCAG CTACCGACGA GCAGCCGGCC 900 GTGATCAAGACGCTGGAGAA GCTGGTCAAC ATCGAGACCG GCACCGGTGA CGCCGAGGGC 960 ATCGCCGCTGCGGGCAACTT CCTCGAGGCC GAGCTCAAGA ACCTCGGCTT CACGGTCACG 1020 CGAAGCAAGTCGGCCGGCCT GGTGGTGGGC GACAACATCG TGGGCAAGAT CAAGGGCCGC 1080 GGCGGCAAGAACCTGCTGCT GATGTCGCAC ATGGACACCG TCTACCTCAA GGGCATTCTC 1140 GCGAAGGCCCCGTTCCGCGT CGAAGGCGAC AAGGCCTACG GCCCGGGCAT 
CGCCGACGAC 1200 AAGGGCGGCAACGCGGTCAT CCTGCACACG CTCAAGCTGC TGAAGGAATA CGGCGTGCGC 1260 GACTACGGCACCATCACCGT GCTGTTCAAC ACCGACGAGG AAAAGGGTTC CTTCGGCTCG 1320 CGCGACCTGATCCAGGAAGA AGCCAAGCTG GCCGACTACG TGCTCTCCTT CGAGCCCACC 1380 AGCGCAGGCGACGAAAAACT CTCGCTGGGC ACCTCGGGCA TCGCCTACGT GCAGGTCCAG 1440 ATCACCGGCAAGGCCTCGCA TGCCGGCGCC GCGCCCGAGC TGGGCGTGAA CGCGCTGGTC 1500 GAGGCTTCCGACCTCGTGCT GCGCACGATG AACATCGACG ACAAGGCGAA GAACCTGCGC 1560 TTCCAGTGGACCATCGCCAA GGCCGGCCAG GTCTCGAACA TCATCCCCGC CAGCGCCACG 1620 CTGAACGCCGACGTGCGCTA CGCGCGCAAC GAGGACTTCG ACGCCGCCAT GAAGACGCTG 1680 GAAGAGCGCGCGCAGCAGAA GAAGCTGCCC GAGGCCGACG TGAAGGTGAT CGTCACGCGC 1740 GGCCGCCCGGCCTTCAATGC CGGCGAAGGC GGCAAGAAGC TGGTCGACAA GGCGGTGGCC 1800 TACTACAAGGAAGCCGGCGG CACGCTGGGC GTGGAAGAGC GCACCGGCGG CGGCACCGAC 1860 GCGGCCTACGCCGCGCTCTC AGGCAAGCCA GTGATCGAGA GCCTGGGCCT GCCGGGCTTC 1920 GGCTACCACAGCGACAAGGC CGAGTACGTG GACATCAGCG CGATTCCGCG CCGCCTGTAC 1980 ATGGCTGCGCGCCTGATCAT GGATCTGGGC GCCGGCAAGT GATAA 2025 288 amino acids amino acidsingle linear protein 38 Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly LeuLeu Leu Leu Ala 1 5 10 15 Ala Gln Pro Ala Met Ala Gln Val Gln Leu GlnGln Pro Gly Ala Glu 20 25 30 Leu Val Lys Pro Gly Ala Ser Val Gln Leu SerCys Lys Ala Ser Gly 35 40 45 Tyr Thr Phe Thr Gly Tyr Trp Ile His Trp ValLys Gln Arg Pro Gly 50 55 60 Gln Gly Leu Glu Trp Ile Gly Glu Val Asn ProSer Thr Gly Arg Ser 65 70 75 80 Asp Tyr Asn Glu Lys Phe Lys Asn Lys AlaThr Leu Thr Val Asp Lys 85 90 95 Ser Ser Thr Thr Ala Tyr Met Gln Leu SerSer Leu Thr Ser Glu Asp 100 105 110 Ser Ala Val Tyr Tyr Cys Ala Arg GluArg Ala Tyr Gly Tyr Asp Asp 115 120 125 Ala Met Asp Tyr Trp Gly Gln GlyThr Thr Val Thr Val Ser Ser Gly 130 135 140 Gly Gly Gly Ser Gly Gly GlyGly Ser Gly Gly Gly Gly Ser Asp Ile 145 150 155 160 Glu Leu Ser Gln SerPro Ser Ser Leu Ala Val Ser Ala Gly Glu Lys 165 170 175 Val Thr Met SerCys Lys Ser Ser Gln Ser Leu Leu Asn Ser Arg Thr 180 185 190 Arg Lys AsnTyr Leu Ala Trp Tyr Gln Gln Arg Pro Gly Gln Ser Pro 195 200 205 Lys 
LeuLeu Ile Tyr Trp Ala Ser Thr Arg Thr Ser Gly Val Pro Asp 210 215 220 ArgPhe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser 225 230 235240 Ser Val Gln Ala Glu Asp Leu Ala Ile Tyr Tyr Cys Lys Gln Ser Tyr 245250 255 Thr Leu Arg Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Arg Glu260 265 270 Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn His His His His HisHis 275 280 285 36 base pairs nucleic acid single linear other nucleicacid 39 GCCCAACCAG CCATGGCCGA GGTGCAGCTG CAGCAG 36 54 base pairs nucleicacid single linear other nucleic acid 40 CGACCCACCA CCGCCCGAGCCACCGCCACC CGAGCTCACG GCGACTGAGG TTCC 54 54 base pairs nucleic acidsingle linear other nucleic acid 41 TCGGGCGGTG GTGGGTCGGG TGGCGGCGGATCTCAGATTG TGCTCACCCA GTCT 54 24 base pairs nucleic acid single linearother nucleic acid 42 CCGTTTGATC TCGAGCTTGG TCCC 24 843 base pairsnucleic acid single linear other nucleic acid 43 ATGAAATACC TATTGCCTACGGCAGCCGCT GGATTGTTAT TACTCGCTGC CCAACCAGCC 60 ATGGCCGAGG TGCAGCTGCAGCAGTCTGGG GCAGAGCTTG TGAGGTCAGG GGCCTCAGTC 120 AAGTTGTCCT GCACAGCTTCTGGCTTCAAC ATTAAAGACA ACTATATGCA CTGGGTGAAG 180 CAGAGGCCTG AACAGGGCCTGGAGTGGATT GCATGGATTG ATCCTGAGAA TGGTGATACT 240 GAATATGCCC CGAAGTTCCGGGGCAAGGCC ACTTTGACTG CAGACTCATC CTCCAACACA 300 GCCTACCTGC ACCTCAGCAGCCTGACATCT GAGGACACTG CCGTCTATTA CTGTCATGTC 360 CTGATCTATG CTGGTTATTTGGCTATGGAC TACTGGGGTC AAGGAACCTC AGTCGCCGTG 420 AGCTCGGGTG GCGGTGGCTCGGGCGGTGGT GGGTCGGGTG GCGGCGGATC TCAGATTGTG 480 CTCACCCAGT CTCCAGCAATCATGTCTGCA TCTCCAGGGG AGAAGGTCAC CATAACCTGC 540 AGTGCCAGCT CAAGTGTAACTTACATGCAC TGGTTCCAGC AGAAGCCAGG CACTTCTCCC 600 AAACTCTGGA TTTATAGCACATCCAACCTG GCTTCTGGAG TCCCTGCTCG CTTCAGTGGC 660 AGTGGATCTG GGACCTCTTACTCTCTCACA ATCAGCCGAA TGGAGGCTGA AGATGCTGCC 720 ACTTATTACT GCCAGCAAAGGAGTACTTAC CCGCTCACGT TCGGTGCTGG GACCAAGCTC 780 GAGATCAAAC GGGAACAAAAACTCATCTCA GAAGAAGATC TGAATCACCA CCATCACCAC 840 CAT 843 281 amino acidsamino acid single linear protein 44 Met Lys Tyr Leu Leu Pro Thr Ala AlaAla Gly Leu Leu Leu Leu Ala 1 5 10 15 Ala Gln Pro Ala Met 
Ala Glu ValGln Leu Gln Gln Ser Gly Ala Glu 20 25 30 Leu Val Arg Ser Gly Ala Ser ValLys Leu Ser Cys Thr Ala Ser Gly 35 40 45 Phe Asn Ile Lys Asp Asn Tyr MetHis Trp Val Lys Gln Arg Pro Glu 50 55 60 Gln Gly Leu Glu Trp Ile Ala TrpIle Asp Pro Glu Asn Gly Asp Thr 65 70 75 80 Glu Tyr Ala Pro Lys Phe ArgGly Lys Ala Thr Leu Thr Ala Asp Ser 85 90 95 Ser Ser Asn Thr Ala Tyr LeuHis Leu Ser Ser Leu Thr Ser Glu Asp 100 105 110 Thr Ala Val Tyr Tyr CysHis Val Leu Ile Tyr Ala Gly Tyr Leu Ala 115 120 125 Met Asp Tyr Trp GlyGln Gly Thr Ser Val Ala Val Ser Ser Gly Gly 130 135 140 Gly Gly Ser GlyGly Gly Gly Ser Gly Gly Gly Gly Ser Gln Ile Val 145 150 155 160 Leu ThrGln Ser Pro Ala Ile Met Ser Ala Ser Pro Gly Glu Lys Val 165 170 175 ThrIle Thr Cys Ser Ala Ser Ser Ser Val Thr Tyr Met His Trp Phe 180 185 190Gln Gln Lys Pro Gly Thr Ser Pro Lys Leu Trp Ile Tyr Ser Thr Ser 195 200205 Asn Leu Ala Ser Gly Val Pro Ala Arg Phe Ser Gly Ser Gly Ser Gly 210215 220 Thr Ser Tyr Ser Leu Thr Ile Ser Arg Met Glu Ala Glu Asp Ala Ala225 230 235 240 Thr Tyr Tyr Cys Gln Gln Arg Ser Thr Tyr Pro Leu Thr PheGly Ala 245 250 255 Gly Thr Lys Leu Glu Ile Lys Arg Glu Gln Lys Leu IleSer Glu Glu 260 265 270 Asp Leu Asn His His His His His His 275 280 72base pairs nucleic acid single linear other nucleic acid 45 TCGAGATCAAACGGGAACAA AAACTCATCT CAGAAGAAGA TCTGAATCAC CACCATCACC 60 ACCATTAATG AG72 72 base pairs nucleic acid single linear other nucleic acid 46AATTCTCATT AATGGTGGTG ATGGTGGTGA TTCAGATCTT CTTCTGAGAT GAGTTTTTGT 60TCCCGTTTGA TC 72 864 base pairs nucleic acid single linear other nucleicacid 47 ATGAAATACC TATTGCCTAC GGCAGCCGCT GGATTGTTAT TACTCGCTGCCCAACCAGCC 60 ATGGCCCAGG TCCAACTGCA GCAGCCTGGG GCTGAACTGG TGAAGCCTGGGGCTTCAGTG 120 CAGCTGTCCT GCAAGGCTTC TGGCTACACC TTCACCGGCT ACTGGATACACTGGGTGAAG 180 CAGAGGCCTG GACAAGGCCT TGAGTGGATT GGAGAGGTTA ATCCTAGTACCGGTCGTTCT 240 GACTACAATG AGAAGTTCAA GAACAAGGCC ACACTGACTG TAGACAAATCCTCCACCACA 300 GCCTACATGC AACTCAGCAG CCTGACATCT GAGGACTCTG CGGTCTATTACTGTGCAAGA 360 GAGAGGGCCT 
ATGGTTACGA CGATGCTATG GACTACTGGG GCCAAGGGACCACGGTCACC 420 GTCTCCTCAG GTGGCGGTGG CTCGGGCGGT GGTGGGTCGG GTGGCGGCGGATCTGACATT 480 GAGCTCTCAC AGTCTCCATC CTCCCTGGCT GTGTCAGCAG GAGAGAAGGTCACCATGAGC 540 TGCAAATCCA GTCAGAGTCT CCTCAACAGT AGAACCCGAA AGAACTACTTGGCTTGGTAC 600 CAGCAGAGAC CAGGGCAGTC TCCTAAACTG CTGATCTATT GGGCATCCACTAGGACATCT 660 GGGGTCCCTG ATCGCTTCAC AGGCAGTGGA TCTGGGACAG ATTTCACTCTCACCATCAGC 720 AGTGTGCAGG CTGAAGACCT GGCAATTTAT TACTGCAAGC AATCTTATACTCTTCGGACG 780 TTCGGTGGAG GCACCAAGCT CGAGATCAAA CGGGAACAAA AACTCATCTCAGAAGAAGAT 840 CTGAATCACC ACCATCACCA CCAT 864 34 base pairs nucleic acidsingle linear other nucleic acid 48 AAGCTTGGAA TTCAGTGTGA GGTGCAGCTGCAGC 34 45 base pairs nucleic acid single linear other nucleic acid 49CGCCACCTCC GGAGCCACCA CCGCCCCGTT TGATCTCGAG CTTGG 45 1998 base pairsnucleic acid single linear other nucleic acid 50 ATGAAGTTGT GGCTGAACTGGATTTTCCTT GTAACACTTT TAAATGGAAT TCAGTGTGAG 60 GTGCAGCTGC AGCAGTCTGGGGCAGAGCTT GTGAGGTCAG GGGCCTCAGT CAAGTTGTCC 120 TGCACAGCTT CTGGCTTCAACATTAAAGAC AACTATATGC ACTGGGTGAA GCAGAGGCCT 180 GAACAGGGCC TGGAGTGGATTGCATGGATT GATCCTGAGA ATGGTGATAC TGAATATGCC 240 CCGAAGTTCC GGGGCAAGGCCACTTTGACT GCAGACTCAT CCTCCAACAC AGCCTACCTG 300 CACCTCAGCA GCCTGACATCTGAGGACACT GCCGTCTATT ACTGTCATGT CCTGATCTAT 360 GCTGGTTATT TGGCTATGGACTACTGGGGT CAAGGAACCT CAGTCGCCGT GAGCTCGGGT 420 GGCGGTGGCT CGGGCGGTGGTGGGTCGGGT GGCGGCGGAT CTCAGATTGT GCTCACCCAG 480 TCTCCAGCAA TCATGTCTGCATCTCCAGGG GAGAAGGTCA CCATAACCTG CAGTGCCAGC 540 TCAAGTGTAA CTTACATGCACTGGTTCCAG CAGAAGCCAG GCACTTCTCC CAAACTCTGG 600 ATTTATAGCA CATCCAACCTGGCTTCTGGA GTCCCTGCTC GCTTCAGTGG CAGTGGATCT 660 GGGACCTCTT ACTCTCTCACAATCAGCCGA ATGGAGGCTG AAGATGCTGC CACTTATTAC 720 TGCCAGCAAA GGAGTACTTACCCGCTCACG TTCGGTGCTG GGACCAAGCT CGAGATCAAA 780 CGGGGCGGTG GTGGCTCCGGAGGTGGCGGT AGCGGTGGCG GGGGTTCCCA GAAGCGCGAC 840 AACGTGCTGT TCCAGGCAGCTACCGACGAG CAGCCGGCCG TGATCAAGAC GCTGGAGAAG 900 CTGGTCAACA TCGAGACCGGCACCGGTGAC GCCGAGGGCA TCGCCGCTGC GGGCAACTTC 960 CTCGAGGCCG AGCTCAAGAACCTCGGCTTC ACGGTCACGC 
GAAGCAAGTC GGCCGGCCTG 1020 GTGGTGGGCG ACAACATCGTGGGCAAGATC AAGGGCCGCG GCGGCAAGAA CCTGCTGCTG 1080 ATGTCGCACA TGGACACCGTCTACCTCAAG GGCATTCTCG CGAAGGCCCC GTTCCGCGTC 1140 GAAGGCGACA AGGCCTACGGCCCGGGCATC GCCGACGACA AGGGCGGCAA CGCGGTCATC 1200 CTGCACACGC TCAAGCTGCTGAAGGAATAC GGCGTGCGCG ACTACGGCAC CATCACCGTG 1260 CTGTTCAACA CCGACGAGGAAAAGGGTTCC TTCGGCTCGC GCGACCTGAT CCAGGAAGAA 1320 GCCAAGCTGG CCGACTACGTGCTCTCCTTC GAGCCCACCA GCGCAGGCGA CGAAAAACTC 1380 TCGCTGGGCA CCTCGGGCATCGCCTACGTG CAGGTCCAGA TCACCGGCAA GGCCTCGCAT 1440 GCCGGCGCCG CGCCCGAGCTGGGCGTGAAC GCGCTGGTCG AGGCTTCCGA CCTCGTGCTG 1500 CGCACGATGA ACATCGACGACAAGGCGAAG AACCTGCGCT TCCAGTGGAC CATCGCCAAG 1560 GCCGGCCAGG TCTCGAACATCATCCCCGCC AGCGCCACGC TGAACGCCGA CGTGCGCTAC 1620 GCGCGCAACG AGGACTTCGACGCCGCCATG AAGACGCTGG AAGAGCGCGC GCAGCAGAAG 1680 AAGCTGCCCG AGGCCGACGTGAAGGTGATC GTCACGCGCG GCCGCCCGGC CTTCAATGCC 1740 GGCGAAGGCG GCAAGAAGCTGGTCGACAAG GCGGTGGCCT ACTACAAGGA AGCCGGCGGC 1800 ACGCTGGGCG TGGAAGAGCGCACCGGCGGC GGCACCGACG CGGCCTACGC CGCGCTCTCA 1860 GGCAAGCCAG TGATCGAGAGCCTGGGCCTG CCGGGCTTCG GCTACCACAG CGACAAGGCC 1920 GAGTACGTGG ACATCAGCGCGATTCCGCGC CGCCTGTACA TGGCTGCGCG CCTGATCATG 1980 GATCTGGGCG CCGGCAAG1998 666 amino acids amino acid single linear protein 51 Met Lys Leu TrpLeu Asn Trp Ile Phe Leu Val Thr Leu Leu Asn Gly 1 5 10 15 Ile Gln CysGlu Val Gln Leu Gln Gln Ser Gly Ala Glu Leu Val Arg 20 25 30 Ser Gly AlaSer Val Lys Leu Ser Cys Thr Ala Ser Gly Phe Asn Ile 35 40 45 Lys Asp AsnTyr Met His Trp Val Lys Gln Arg Pro Glu Gln Gly Leu 50 55 60 Glu Trp IleAla Trp Ile Asp Pro Glu Asn Gly Asp Thr Glu Tyr Ala 65 70 75 80 Pro LysPhe Arg Gly Lys Ala Thr Leu Thr Ala Asp Ser Ser Ser Asn 85 90 95 Thr AlaTyr Leu His Leu Ser Ser Leu Thr Ser Glu Asp Thr Ala Val 100 105 110 TyrTyr Cys His Val Leu Ile Tyr Ala Gly Tyr Leu Ala Met Asp Tyr 115 120 125Trp Gly Gln Gly Thr Ser Val Ala Val Ser Ser Gly Gly Gly Gly Ser 130 135140 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gln Ile Val Leu Thr Gln 145150 155 160 Ser Pro Ala Ile Met Ser Ala Ser Pro 
Gly Glu Lys Val Thr IleThr 165 170 175 Cys Ser Ala Ser Ser Ser Val Thr Tyr Met His Trp Phe GlnGln Lys 180 185 190 Pro Gly Thr Ser Pro Lys Leu Trp Ile Tyr Ser Thr SerAsn Leu Ala 195 200 205 Ser Gly Val Pro Ala Arg Phe Ser Gly Ser Gly SerGly Thr Ser Tyr 210 215 220 Ser Leu Thr Ile Ser Arg Met Glu Ala Glu AspAla Ala Thr Tyr Tyr 225 230 235 240 Cys Gln Gln Arg Ser Thr Tyr Pro LeuThr Phe Gly Ala Gly Thr Lys 245 250 255 Leu Glu Ile Lys Arg Gly Gly GlyGly Ser Gly Gly Gly Gly Ser Gly 260 265 270 Gly Gly Gly Ser Gln Lys ArgAsp Asn Val Leu Phe Gln Ala Ala Thr 275 280 285 Asp Glu Gln Pro Ala ValIle Lys Thr Leu Glu Lys Leu Val Asn Ile 290 295 300 Glu Thr Gly Thr GlyAsp Ala Glu Gly Ile Ala Ala Ala Gly Asn Phe 305 310 315 320 Leu Glu AlaGlu Leu Lys Asn Leu Gly Phe Thr Val Thr Arg Ser Lys 325 330 335 Ser AlaGly Leu Val Val Gly Asp Asn Ile Val Gly Lys Ile Lys Gly 340 345 350 ArgGly Gly Lys Asn Leu Leu Leu Met Ser His Met Asp Thr Val Tyr 355 360 365Leu Lys Gly Ile Leu Ala Lys Ala Pro Phe Arg Val Glu Gly Asp Lys 370 375380 Ala Tyr Gly Pro Gly Ile Ala Asp Asp Lys Gly Gly Asn Ala Val Ile 385390 395 400 Leu His Thr Leu Lys Leu Leu Lys Glu Tyr Gly Val Arg Asp TyrGly 405 410 415 Thr Ile Thr Val Leu Phe Asn Thr Asp Glu Glu Lys Gly SerPhe Gly 420 425 430 Ser Arg Asp Leu Ile Gln Glu Glu Ala Lys Leu Ala AspTyr Val Leu 435 440 445 Ser Phe Glu Pro Thr Ser Ala Gly Asp Glu Lys LeuSer Leu Gly Thr 450 455 460 Ser Gly Ile Ala Tyr Val Gln Val Gln Ile ThrGly Lys Ala Ser His 465 470 475 480 Ala Gly Ala Ala Pro Glu Leu Gly ValAsn Ala Leu Val Glu Ala Ser 485 490 495 Asp Leu Val Leu Arg Thr Met AsnIle Asp Asp Lys Ala Lys Asn Leu 500 505 510 Arg Phe Gln Trp Thr Ile AlaLys Ala Gly Gln Val Ser Asn Ile Ile 515 520 525 Pro Ala Ser Ala Thr LeuAsn Ala Asp Val Arg Tyr Ala Arg Asn Glu 530 535 540 Asp Phe Asp Ala AlaMet Lys Thr Leu Glu Glu Arg Ala Gln Gln Lys 545 550 555 560 Lys Leu ProGlu Ala Asp Val Lys Val Ile Val Thr Arg Gly Arg Pro 565 570 575 Ala PheAsn Ala Gly Glu Gly Gly Lys Lys Leu Val Asp Lys Ala Val 580 
585 590 AlaTyr Tyr Lys Glu Ala Gly Gly Thr Leu Gly Val Glu Glu Arg Thr 595 600 605Gly Gly Gly Thr Asp Ala Ala Tyr Ala Ala Leu Ser Gly Lys Pro Val 610 615620 Ile Glu Ser Leu Gly Leu Pro Gly Phe Gly Tyr His Ser Asp Lys Ala 625630 635 640 Glu Tyr Val Asp Ile Ser Ala Ile Pro Arg Arg Leu Tyr Met AlaAla 645 650 655 Arg Leu Ile Met Asp Leu Gly Ala Gly Lys 660 665 3217base pairs nucleic acid single linear other nucleic acid 52 GAATTCGCCGCCACTATGGA TTTTCAAGTG CAGATTTTCA GCTTCCTGCT AATCAGTGCT 60 TCAGTCATAATGTCCAGAGG ACAAACTGTT CTCTCCCAGT CTCCAGCAAT CCTGTCTGCA 120 TCTCCAGGGGAGAAGGTCAC AATGACTTGC AGGGCCAGCT CAAGTGTAAC TTACATTCAC 180 TGGTACCAGCAGAAGCCAGG TTCCTCCCCC AAATCCTGGA TTTATGCCAC ATCCAACCTG 240 GCTTCTGGAGTCCCTGCTCG CTTCAGTGGC AGTGGGTCTG GGACCTCTTA CTCTCTCACA 300 ATCAGCAGAGTGGAGGCTGA AGATGCTGCC ACTTATTACT GCCAACATTG GAGTAGTAAA 360 CCACCGACGTTCGGTGGAGG CACCAAGCTC GAGATCAAAC GGACTGTGGC TGCACCATCT 420 GTCTTCATCTTCCCGCCATC TGATGAGCAG TTGAAATCTG GAACTGCCTC TGTTGTGTGC 480 CTGCTGAATAACTTCTATCC CAGAGAGGCC AAAGTACAGT GGAAGGTGGA TAACGCCCTC 540 CAATCGGGTAACTCCCAGGA GAGTGTCACA GAGCAGGACA GCAAGGACAG CACCTACAGC 600 CTCAGCAGCACCCTGACGCT GAGCAAAGCA GACTACGAGA AACACAAAGT CTACGCCTGC 660 GAAGTCACCCATCAGGGCCT GAGTTCGCCC GTCACAAAGA GCTTCAACAG GGGAGAGTGT 720 TAATAGGAGCTCGGATCCAG ATCTGAGCTC CTGTAGACGT CGACATTAAT TCCGGTTATT 780 TTCCACCATATTGCCGTCTT TTGGCAATGT GAGGGCCCGG AAACCTGGCC CTGTCTTCTT 840 GACGAGCATTCCTAGGGGTC TTTCCCCTCT CGCCAAAGGA ATGCAAGGTC TGTTGAATGT 900 CGTGAAGGAAGCAGTTCCTC TGGAAGCTTC TTGAAGACAA ACAACGTCTG TAGCGACCCT 960 TTGCAGGCAGCGGAACCCCC CACCTGGCGA CAGGTGCCTC TGCGGCCAAA AGCCACGTGT 1020 ATAAGATACACCTGCAAAGG CGGCACAACC CCAGTGCCAC GTTGTGAGTT GGATAGTTGT 1080 GGAAAGAGTCAAATGGCTCT CCTCAAGCGT ATTCAACAAG GGGCTGAAGG ATGCCCAGAA 1140 GGTACCCCATTGTATGGGAT CTGATCTGGG GCCTCGGTGC ACATGCTTTA CATGTGTTTA 1200 GTCGAGGTTAAAAAACGTCT AGGCCCCCCG AACCACGGGG ACGTGGTTTT CCTTTGAAAA 1260 ACACGATGATAATACCATGG AGTTGTGGCT GAACTGGATT TTCCTTGTAA CACTTTTAAA 1320 TGGTATCCAGTGTGAGGTGA AGCTGGTGGA GTCTGGAGGA GGCTTGGTAC 
AGCCTGGGGG 1380 TTCTCTGAGACTCTCCTGTG CAACTTCTGG GTTCACCTTC ACTGATTACT ACATGAACTG 1440 GGTCCGCCAGCCTCCAGGAA AGGCACTTGA GTGGTTGGGT TTTATTGGAA ACAAAGCTAA 1500 TGGTTACACAACAGAGTACA GTGCATCTGT GAAGGGTCGG TTCACCATCT CCAGAGATAA 1560 ATCCCAAAGCATCCTCTATC TTCAAATGAA CACCCTGAGA GCTGAGGACA GTGCCACTTA 1620 TTACTGTACAAGAGATAGGG GGCTACGGTT CTACTTTGAC TACTGGGGCC AAGGCACCAC 1680 TCTCACAGTGAGCTCGGCTA GCACCAAGGG ACCATCGGTC TTCCCCCTGG CCCCCTGCTC 1740 CAGGAGCACCTCCGAGAGCA CAGCCGCCCT GGGCTGCCTG GTCAAGGACT ACTTCCCCGA 1800 ACCGGTGACGGTGTCGTGGA ACTCAGGCGC TCTGACCAGC GGCGTGCACA CCTTCCCGGC 1860 TGTCCTACAGTCCTCAGGAC TCTACTCCCT CAGCAGCGTC GTGACGGTGC CCTCCAGCAA 1920 CTTCGGCACCCAGACCTACA CCTGCAACGT AGATCACAAG CCCAGCAACA CCAAGGTGGA 1980 CAAGACAGTTGGCGGTGGTG GCTCTGGTGG TGGCGGTAGC GGTGGCGGGG GTTCCCAGAA 2040 GCGCGACAACGTGCTGTTCC AGGCAGCTAC CGACGAGCAG CCGGCCGTGA TCAAGACGCT 2100 GGAGAAGCTGGTCAACATCG AGACCGGCAC CGGTGACGCC GAGGGCATCG CCGCTGCGGG 2160 CAACTTCCTCGAGGCCGAGC TCAAGAACCT CGGCTTCACG GTCACGCGAA GCAAGTCGGC 2220 CGGCCTGGTGGTGGGCGACA ACATCGTGGG CAAGATCAAG GGCCGCGGCG GCAAGAACCT 2280 GCTGCTGATGTCGCACATGG ACACCGTCTA CCTCAAGGGC ATTCTCGCGA AGGCCCCGTT 2340 CCGCGTCGAAGGCGACAAGG CCTACGGCCC GGGCATCGCC GACGACAAGG GCGGCAACGC 2400 GGTCATCCTGCACACGCTCA AGCTGCTGAA GGAATACGGC GTGCGCGACT ACGGCACCAT 2460 CACCGTGCTGTTCAACACCG ACGAGGAAAA GGGTTCCTTC GGCTCGCGCG ACCTGATCCA 2520 GGAAGAAGCCAAGCTGGCCG ACTACGTGCT CTCCTTCGAG CCCACCAGCG CAGGCGACGA 2580 AAAACTCTCGCTGGGCACCT CGGGCATCGC CTACGTGCAG GTCCAGATCA CCGGCAAGGC 2640 CTCGCATGCCGGCGCCGCGC CCGAGCTGGG CGTGAACGCG CTGGTCGAGG CTTCCGACCT 2700 CGTGCTGCGCACGATGAACA TCGACGACAA GGCGAAGAAC CTGCGCTTCC AGTGGACCAT 2760 CGCCAAGGCCGGCCAGGTCT CGAACATCAT CCCCGCCAGC GCCACGCTGA ACGCCGACGT 2820 GCGCTACGCGCGCAACGAGG ACTTCGACGC CGCCATGAAG ACGCTGGAAG AGCGCGCGCA 2880 GCAGAAGAAGCTGCCCGAGG CCGACGTGAA GGTGATCGTC ACGCGCGGCC GCCCGGCCTT 2940 CAATGCCGGCGAAGGCGGCA AGAAGCTGGT CGACAAGGCG GTGGCCTACT ACAAGGAAGC 3000 CGGCGGCACGCTGGGCGTGG AAGAGCGCAC CGGCGGCGGC ACCGACGCGG CCTACGCCGC 3060 GCTCTCAGGCAAGCCAGTGA 
TCGAGAGCCT GGGCCTGCCG GGCTTCGGCT ACCACAGCGA 3120 CAAGGCCGAGTACGTGGACA TCAGCGCGAT TCCGCGCCGC CTGTACATGG CTGCGCGCCT 3180 GATCATGGATCTGGGCGCCG GCAAGTGATA ATCTAGA 3217 35 base pairs nucleic acid singlelinear other nucleic acid 53 TGGATCTGAA GCTTAAACTA ACTCCATGGT GACCC 3561 base pairs nucleic acid single linear other nucleic acid 54GCCACGGATC CCGCCACCTC CGGAGCCACC ACCGCCACAA TCCCTGGGCA CAATTTTCTT 60 G61 94 base pairs nucleic acid single linear other nucleic acid 55GCCCAGGAAG CTTGGCGGTG GTGGCTCCGG AGGTGGCGGT AGCGGTGGCG GGGGTTCCCA 60GAAGCGCGAC AACGTGCTGT TCCAGGCAGC TACC 94 51 base pairs nucleic acidsingle linear other nucleic acid 56 ATGTGCGAAT TCAGCAGCAG GTTCTTGCCGCCGCGGCCCT TGATCTTGCC C 51 732 base pairs nucleic acid single linearother nucleic acid CDS 16..720 57 GAATTCGCCG CCACC ATG GAT TTT CAA GTGCAG ATT TTC AGC TTC CTG CTA 51 Met Asp Phe Gln Val Gln Ile Phe Ser PheLeu Leu 1 5 10 ATC AGT GCT TCA GTC ATA ATG TCC AGA GGA CAA ACT GTT CTCTCC CAG 99 Ile Ser Ala Ser Val Ile Met Ser Arg Gly Gln Thr Val Leu SerGln 15 20 25 TCT CCA GCA ATC CTG TCT GCA TCT CCA GGG GAG AAG GTC ACA ATGACT 147 Ser Pro Ala Ile Leu Ser Ala Ser Pro Gly Glu Lys Val Thr Met Thr30 35 40 TGC AGG GCC AGC TCA AGT GTA ACT TAC ATT CAC TGG TAC CAG CAG AAG195 Cys Arg Ala Ser Ser Ser Val Thr Tyr Ile His Trp Tyr Gln Gln Lys 4550 55 60 CCA GGT TCC TCC CCC AAA TCC TGG ATT TAT GCC ACA TCC AAC CTG GCT243 Pro Gly Ser Ser Pro Lys Ser Trp Ile Tyr Ala Thr Ser Asn Leu Ala 6570 75 TCT GGA GTC CCT GCT CGC TTC AGT GGC AGT GGG TCT GGG ACC TCT TAC291 Ser Gly Val Pro Ala Arg Phe Ser Gly Ser Gly Ser Gly Thr Ser Tyr 8085 90 TCT CTC ACA ATC AGC AGA GTG GAG GCT GAA GAT GCT GCC ACT TAT TAC339 Ser Leu Thr Ile Ser Arg Val Glu Ala Glu Asp Ala Ala Thr Tyr Tyr 95100 105 TGC CAA CAT TGG AGT AGT AAA CCA CCG ACG TTC GGT GGA GGC ACC AAG387 Cys Gln His Trp Ser Ser Lys Pro Pro Thr Phe Gly Gly Gly Thr Lys 110115 120 CTG GAA ATC AAA CGG GCT GAT GCT GCA CCA ACT GTA TCC ATC TTC CCA435 Leu Glu Ile Lys Arg Ala Asp Ala Ala Pro Thr Val Ser Ile Phe 
Pro 125130 135 140 CCA TCC AGT GAG CAG TTA ACA TCT GGA GGT GCC TCA GTC GTG TGCTTC 483 Pro Ser Ser Glu Gln Leu Thr Ser Gly Gly Ala Ser Val Val Cys Phe145 150 155 TTG AAC AAC TTC TAC CCC AAA GAC ATC AAT GTC AAG TGG AAG ATTGAT 531 Leu Asn Asn Phe Tyr Pro Lys Asp Ile Asn Val Lys Trp Lys Ile Asp160 165 170 GGC AGT GAA CGA CAA AAT GGC GTC CTG AAC AGT TGG ACT GAT CAGGAC 579 Gly Ser Glu Arg Gln Asn Gly Val Leu Asn Ser Trp Thr Asp Gln Asp175 180 185 AGC AAA GAC AGC ACC TAC AGC ATG AGC AGC ACC CTC ACG TTG ACCAAG 627 Ser Lys Asp Ser Thr Tyr Ser Met Ser Ser Thr Leu Thr Leu Thr Lys190 195 200 GAC GAG TAT GAA CGA CAT AAC AGC TAT ACC TGT GAG GCC ACT CACAAG 675 Asp Glu Tyr Glu Arg His Asn Ser Tyr Thr Cys Glu Ala Thr His Lys205 210 215 220 ACA TCA ACT TCA CCC ATT GTC AAG AGC TTC AAC AGG AAT GAGTGT 720 Thr Ser Thr Ser Pro Ile Val Lys Ser Phe Asn Arg Asn Glu Cys 225230 235 TAATAAGAAT TC 732 235 amino acids amino acid linear protein 58Met Asp Phe Gln Val Gln Ile Phe Ser Phe Leu Leu Ile Ser Ala Ser 1 5 1015 Val Ile Met Ser Arg Gly Gln Thr Val Leu Ser Gln Ser Pro Ala Ile 20 2530 Leu Ser Ala Ser Pro Gly Glu Lys Val Thr Met Thr Cys Arg Ala Ser 35 4045 Ser Ser Val Thr Tyr Ile His Trp Tyr Gln Gln Lys Pro Gly Ser Ser 50 5560 Pro Lys Ser Trp Ile Tyr Ala Thr Ser Asn Leu Ala Ser Gly Val Pro 65 7075 80 Ala Arg Phe Ser Gly Ser Gly Ser Gly Thr Ser Tyr Ser Leu Thr Ile 8590 95 Ser Arg Val Glu Ala Glu Asp Ala Ala Thr Tyr Tyr Cys Gln His Trp100 105 110 Ser Ser Lys Pro Pro Thr Phe Gly Gly Gly Thr Lys Leu Glu IleLys 115 120 125 Arg Ala Asp Ala Ala Pro Thr Val Ser Ile Phe Pro Pro SerSer Glu 130 135 140 Gln Leu Thr Ser Gly Gly Ala Ser Val Val Cys Phe LeuAsn Asn Phe 145 150 155 160 Tyr Pro Lys Asp Ile Asn Val Lys Trp Lys IleAsp Gly Ser Glu Arg 165 170 175 Gln Asn Gly Val Leu Asn Ser Trp Thr AspGln Asp Ser Lys Asp Ser 180 185 190 Thr Tyr Ser Met Ser Ser Thr Leu ThrLeu Thr Lys Asp Glu Tyr Glu 195 200 205 Arg His Asn Ser Tyr Thr Cys GluAla Thr His Lys Thr Ser Thr Ser 210 215 220 Pro Ile Val Lys Ser Phe AsnArg 
Asn Glu Cys 225 230 235 1974 base pairs nucleic acid single linearother nucleic acid CDS 16..1956 59 AAGCTTGCCG CCACC ATG AAG TTG TGG CTGAAC TGG ATT TTC CTT GTA ACA 51 Met Lys Leu Trp Leu Asn Trp Ile Phe LeuVal Thr 1 5 10 CTT TTA AAT GGT ATC CAG TGT GAG GTG AAG CTG GTG GAG TCTGGA GGA 99 Leu Leu Asn Gly Ile Gln Cys Glu Val Lys Leu Val Glu Ser GlyGly 15 20 25 GGC TTG GTA CAG CCT GGG GGT TCT CTG AGA CTC TCC TGT GCA ACTTCT 147 Gly Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Thr Ser30 35 40 GGG TTC ACC TTC ACT GAT TAC TAC ATG AAC TGG GTC CGC CAG CCT CCA195 Gly Phe Thr Phe Thr Asp Tyr Tyr Met Asn Trp Val Arg Gln Pro Pro 4550 55 60 GGA AAG GCA CTT GAG TGG TTG GGT TTT ATT GGA AAC AAA GCT AAT GGT243 Gly Lys Ala Leu Glu Trp Leu Gly Phe Ile Gly Asn Lys Ala Asn Gly 6570 75 TAC ACA ACA GAG TAC AGT GCA TCT GTG AAG GGT CGG TTC ACC ATC TCC291 Tyr Thr Thr Glu Tyr Ser Ala Ser Val Lys Gly Arg Phe Thr Ile Ser 8085 90 AGA GAT AAA TCC CAA AGC ATC CTC TAT CTT CAA ATG AAC ACC CTG AGA339 Arg Asp Lys Ser Gln Ser Ile Leu Tyr Leu Gln Met Asn Thr Leu Arg 95100 105 GCT GAG GAC AGT GCC ACT TAT TAC TGT ACA AGA GAT AGG GGG CTA CGG387 Ala Glu Asp Ser Ala Thr Tyr Tyr Cys Thr Arg Asp Arg Gly Leu Arg 110115 120 TTC TAC TTT GAC TAC TGG GGC CAA GGC ACC ACT CTC ACA GTC TCC TCA435 Phe Tyr Phe Asp Tyr Trp Gly Gln Gly Thr Thr Leu Thr Val Ser Ser 125130 135 140 GCC AAA ACG ACA CCC CCA TCT GTC TAT CCA CTG GCC CCT GGA TCTGCT 483 Ala Lys Thr Thr Pro Pro Ser Val Tyr Pro Leu Ala Pro Gly Ser Ala145 150 155 GCC CAA ACT AAC TCC ATG GTG ACC CTG GGA TGC CTG GTC AAG GGCTAT 531 Ala Gln Thr Asn Ser Met Val Thr Leu Gly Cys Leu Val Lys Gly Tyr160 165 170 TTC CCT GAG CCA GTG ACA GTG ACC TGG AAC TCT GGA TCT CTG TCCAGC 579 Phe Pro Glu Pro Val Thr Val Thr Trp Asn Ser Gly Ser Leu Ser Ser175 180 185 GGT GTG CAC ACC TTC CCA GCT GTC CTG CAG TCT GAC CTC TAC ACTCTG 627 Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Asp Leu Tyr Thr Leu190 195 200 AGC AGC TCA GTG ACT GTC CCC TCC AGC ACC TGG CCC AGC GAG ACCGTC 675 Ser Ser Ser Val 
Thr Val Pro Ser Ser Thr Trp Pro Ser Glu Thr Val205 210 215 220 ACC TGC AAC GTT GCC CAC CCG GCC AGC AGC ACC AAG GTG GACAAG AAA 723 Thr Cys Asn Val Ala His Pro Ala Ser Ser Thr Lys Val Asp LysLys 225 230 235 ATT GTG CCC AGG GAT TGT GGC GGT GGT GGC TCC GGA GGT GGCGGT AGC 771 Ile Val Pro Arg Asp Cys Gly Gly Gly Gly Ser Gly Gly Gly GlySer 240 245 250 GGT GGC GGG GGT TCC CAG AAG CGC GAC AAC GTG CTG TTC CAGGCA GCT 819 Gly Gly Gly Gly Ser Gln Lys Arg Asp Asn Val Leu Phe Gln AlaAla 255 260 265 ACC GAC GAG CAG CCG GCC GTG ATC AAG ACG CTG GAG AAG CTGGTC AAC 867 Thr Asp Glu Gln Pro Ala Val Ile Lys Thr Leu Glu Lys Leu ValAsn 270 275 280 ATC GAG ACC GGC ACC GGT GAC GCC GAG GGC ATC GCC GCT GCGGGC AAC 915 Ile Glu Thr Gly Thr Gly Asp Ala Glu Gly Ile Ala Ala Ala GlyAsn 285 290 295 300 TTC CTC GAG GCC GAG CTC AAG AAC CTC GGC TTC ACG GTCACG CGA AGC 963 Phe Leu Glu Ala Glu Leu Lys Asn Leu Gly Phe Thr Val ThrArg Ser 305 310 315 AAG TCG GCC GGC CTG GTG GTG GGC GAC AAC ATC GTG GGCAAG ATC AAG 1011 Lys Ser Ala Gly Leu Val Val Gly Asp Asn Ile Val Gly LysIle Lys 320 325 330 GGC CGC GGC GGC AAG AAC CTG CTG CTG ATG TCG CAC ATGGAC ACC GTC 1059 Gly Arg Gly Gly Lys Asn Leu Leu Leu Met Ser His Met AspThr Val 335 340 345 TAC CTC AAG GGC ATT CTC GCG AAG GCC CCG TTC CGC GTCGAA GGC GAC 1107 Tyr Leu Lys Gly Ile Leu Ala Lys Ala Pro Phe Arg Val GluGly Asp 350 355 360 AAG GCC TAC GGC CCG GGC ATC GCC GAC GAC AAG GGC GGCAAC GCG GTC 1155 Lys Ala Tyr Gly Pro Gly Ile Ala Asp Asp Lys Gly Gly AsnAla Val 365 370 375 380 ATC CTG CAC ACG CTC AAG CTG CTG AAG GAA TAC GGCGTG CGC GAC TAC 1203 Ile Leu His Thr Leu Lys Leu Leu Lys Glu Tyr Gly ValArg Asp Tyr 385 390 395 GGC ACC ATC ACC GTG CTG TTC AAC ACC GAC GAG GAAAAG GGT TCC TTC 1251 Gly Thr Ile Thr Val Leu Phe Asn Thr Asp Glu Glu LysGly Ser Phe 400 405 410 GGC TCG CGC GAC CTG ATC CAG GAA GAA GCC AAG CTGGCC GAC TAC GTG 1299 Gly Ser Arg Asp Leu Ile Gln Glu Glu Ala Lys Leu AlaAsp Tyr Val 415 420 425 CTC TCC TTC GAG CCC ACC AGC GCA GGC GAC GAA AAACTC TCG CTG GGC 1347 Leu Ser 
Phe Glu Pro Thr Ser Ala Gly Asp Glu Lys LeuSer Leu Gly 430 435 440 ACC TCG GGC ATC GCC TAC GTG CAG GTC AAC ATC ACCGGC AAG GCC TCG 1395 Thr Ser Gly Ile Ala Tyr Val Gln Val Asn Ile Thr GlyLys Ala Ser 445 450 455 460 CAT GCC GGC GCC GCG CCC GAG CTG GGC GTG AACGCG CTG GTC GAG GCT 1443 His Ala Gly Ala Ala Pro Glu Leu Gly Val Asn AlaLeu Val Glu Ala 465 470 475 TCC GAC CTC GTG CTG CGC ACG ATG AAC ATC GACGAC AAG GCG AAG AAC 1491 Ser Asp Leu Val Leu Arg Thr Met Asn Ile Asp AspLys Ala Lys Asn 480 485 490 CTG CGC TTC AAC TGG ACC ATC GCC AAG GCC GGCAAC GTC TCG AAC ATC 1539 Leu Arg Phe Asn Trp Thr Ile Ala Lys Ala Gly AsnVal Ser Asn Ile 495 500 505 ATC CCC GCC AGC GCC ACG CTG AAC GCC GAC GTGCGC TAC GCG CGC AAC 1587 Ile Pro Ala Ser Ala Thr Leu Asn Ala Asp Val ArgTyr Ala Arg Asn 510 515 520 GAG GAC TTC GAC GCC GCC ATG AAG ACG CTG GAAGAG CGC GCG CAG CAG 1635 Glu Asp Phe Asp Ala Ala Met Lys Thr Leu Glu GluArg Ala Gln Gln 525 530 535 540 AAG AAG CTG CCC GAG GCC GAC GTG AAG GTGATC GTC ACG CGC GGC CGC 1683 Lys Lys Leu Pro Glu Ala Asp Val Lys Val IleVal Thr Arg Gly Arg 545 550 555 CCG GCC TTC AAT GCC GGC GAA GGC GGC AAGAAG CTG GTC GAC AAG GCG 1731 Pro Ala Phe Asn Ala Gly Glu Gly Gly Lys LysLeu Val Asp Lys Ala 560 565 570 GTG GCC TAC TAC AAG GAA GCC GGC GGC ACGCTG GGC GTG GAA GAG CGC 1779 Val Ala Tyr Tyr Lys Glu Ala Gly Gly Thr LeuGly Val Glu Glu Arg 575 580 585 ACC GGC GGC GGC ACC GAC GCG GCC TAC GCCGCG CTC TCA GGC AAG CCA 1827 Thr Gly Gly Gly Thr Asp Ala Ala Tyr Ala AlaLeu Ser Gly Lys Pro 590 595 600 GTG ATC GAG AGC CTG GGC CTG CCG GGC TTCGGC TAC CAC AGC GAC AAG 1875 Val Ile Glu Ser Leu Gly Leu Pro Gly Phe GlyTyr His Ser Asp Lys 605 610 615 620 GCC GAG TAC GTG GAC ATC AGC GCG ATTCCG CGC CGC CTG TAC ATG GCT 1923 Ala Glu Tyr Val Asp Ile Ser Ala Ile ProArg Arg Leu Tyr Met Ala 625 630 635 GCG CGC CTG ATC ATG GAT CTG GGC GCCGGC AAG TGATAAGAAT TCCTCGAG 1974 Ala Arg Leu Ile Met Asp Leu Gly Ala GlyLys 640 645 647 amino acids amino acid linear protein 60 Met Lys Leu TrpLeu Asn Trp Ile Phe Leu 
Val Thr Leu Leu Asn Gly 1 5 10 15 Ile Gln CysGlu Val Lys Leu Val Glu Ser Gly Gly Gly Leu Val Gln 20 25 30 Pro Gly GlySer Leu Arg Leu Ser Cys Ala Thr Ser Gly Phe Thr Phe 35 40 45 Thr Asp TyrTyr Met Asn Trp Val Arg Gln Pro Pro Gly Lys Ala Leu 50 55 60 Glu Trp LeuGly Phe Ile Gly Asn Lys Ala Asn Gly Tyr Thr Thr Glu 65 70 75 80 Tyr SerAla Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Lys Ser 85 90 95 Gln SerIle Leu Tyr Leu Gln Met Asn Thr Leu Arg Ala Glu Asp Ser 100 105 110 AlaThr Tyr Tyr Cys Thr Arg Asp Arg Gly Leu Arg Phe Tyr Phe Asp 115 120 125Tyr Trp Gly Gln Gly Thr Thr Leu Thr Val Ser Ser Ala Lys Thr Thr 130 135140 Pro Pro Ser Val Tyr Pro Leu Ala Pro Gly Ser Ala Ala Gln Thr Asn 145150 155 160 Ser Met Val Thr Leu Gly Cys Leu Val Lys Gly Tyr Phe Pro GluPro 165 170 175 Val Thr Val Thr Trp Asn Ser Gly Ser Leu Ser Ser Gly ValHis Thr 180 185 190 Phe Pro Ala Val Leu Gln Ser Asp Leu Tyr Thr Leu SerSer Ser Val 195 200 205 Thr Val Pro Ser Ser Thr Trp Pro Ser Glu Thr ValThr Cys Asn Val 210 215 220 Ala His Pro Ala Ser Ser Thr Lys Val Asp LysLys Ile Val Pro Arg 225 230 235 240 Asp Cys Gly Gly Gly Gly Ser Gly GlyGly Gly Ser Gly Gly Gly Gly 245 250 255 Ser Gln Lys Arg Asp Asn Val LeuPhe Gln Ala Ala Thr Asp Glu Gln 260 265 270 Pro Ala Val Ile Lys Thr LeuGlu Lys Leu Val Asn Ile Glu Thr Gly 275 280 285 Thr Gly Asp Ala Glu GlyIle Ala Ala Ala Gly Asn Phe Leu Glu Ala 290 295 300 Glu Leu Lys Asn LeuGly Phe Thr Val Thr Arg Ser Lys Ser Ala Gly 305 310 315 320 Leu Val ValGly Asp Asn Ile Val Gly Lys Ile Lys Gly Arg Gly Gly 325 330 335 Lys AsnLeu Leu Leu Met Ser His Met Asp Thr Val Tyr Leu Lys Gly 340 345 350 IleLeu Ala Lys Ala Pro Phe Arg Val Glu Gly Asp Lys Ala Tyr Gly 355 360 365Pro Gly Ile Ala Asp Asp Lys Gly Gly Asn Ala Val Ile Leu His Thr 370 375380 Leu Lys Leu Leu Lys Glu Tyr Gly Val Arg Asp Tyr Gly Thr Ile Thr 385390 395 400 Val Leu Phe Asn Thr Asp Glu Glu Lys Gly Ser Phe Gly Ser ArgAsp 405 410 415 Leu Ile Gln Glu Glu Ala Lys Leu Ala Asp Tyr Val Leu SerPhe Glu 420 425 430 Pro Thr Ser 
Ala Gly Asp Glu Lys Leu Ser Leu Gly ThrSer Gly Ile 435 440 445 Ala Tyr Val Gln Val Asn Ile Thr Gly Lys Ala SerHis Ala Gly Ala 450 455 460 Ala Pro Glu Leu Gly Val Asn Ala Leu Val GluAla Ser Asp Leu Val 465 470 475 480 Leu Arg Thr Met Asn Ile Asp Asp LysAla Lys Asn Leu Arg Phe Asn 485 490 495 Trp Thr Ile Ala Lys Ala Gly AsnVal Ser Asn Ile Ile Pro Ala Ser 500 505 510 Ala Thr Leu Asn Ala Asp ValArg Tyr Ala Arg Asn Glu Asp Phe Asp 515 520 525 Ala Ala Met Lys Thr LeuGlu Glu Arg Ala Gln Gln Lys Lys Leu Pro 530 535 540 Glu Ala Asp Val LysVal Ile Val Thr Arg Gly Arg Pro Ala Phe Asn 545 550 555 560 Ala Gly GluGly Gly Lys Lys Leu Val Asp Lys Ala Val Ala Tyr Tyr 565 570 575 Lys GluAla Gly Gly Thr Leu Gly Val Glu Glu Arg Thr Gly Gly Gly 580 585 590 ThrAsp Ala Ala Tyr Ala Ala Leu Ser Gly Lys Pro Val Ile Glu Ser 595 600 605Leu Gly Leu Pro Gly Phe Gly Tyr His Ser Asp Lys Ala Glu Tyr Val 610 615620 Asp Ile Ser Ala Ile Pro Arg Arg Leu Tyr Met Ala Ala Arg Leu Ile 625630 635 640 Met Asp Leu Gly Ala Gly Lys 645
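
The paired nucleotide and amino acid rows in the listing above can be cross-checked by translating codons with the standard genetic code. The following is a minimal sketch only: the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative assumptions, and the fragment is the row ending at nucleotide position 771 in the listing (residues Ile Val Pro Arg Asp Cys followed by the start of the Gly/Ser linker).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter codes, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (whitespace ignored) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# Fragment copied from the listing (row ending at nucleotide 771).
fragment = "ATT GTG CCC AGG GAT TGT GGC GGT GGT GGC TCC GGA GGT GGC GGT AGC"
print(" ".join(translate(fragment)))
```

Comparing the printed residues with the amino acid row under the same codons in the listing is a quick way to spot words fused or dropped during text extraction.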

What is claimed is:
 1. A matched two component system designed for use in a mammalian host in which the components comprise: (i) a first component that comprises a gene construct encoding a cell targeting moiety and a heterologous prodrug activating enzyme and a secretory leader sequence for use as a medicament in a mammalian host wherein the gene construct is capable of expressing the cell targeting moiety and heterologous prodrug activating enzyme and the secretory leader sequence as a conjugate within a cell in the mammalian host and wherein the conjugate is directed by the secretory leader sequence to leave the cell thereafter for selective localisation at a cell surface antigen recognised by the cell targeting moiety; and (ii) a second component that comprises a prodrug which can be converted into a cytotoxic drug by the heterologous enzyme encoded by the first component.
 2. The matched two component system according to claim 1 in which the heterologous prodrug activating enzyme is a heterologous enzyme CPG2; and the second component prodrug is selected from N-(4-[N,N-bis(2-iodoethyl)amino]-phenoxycarbonyl)-L-glutamic acid, N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof.
 3. The matched two component system according to claim 1 wherein the cell targeting moiety is selectively localised to a cell surface antigen, and the cell surface antigen is specific for at least one human solid tumor.
 4. The matched two component system according to claim 1 wherein the cell targeting moiety is an antibody or fragment thereof.
 5. The matched two component system according to claim 4 wherein the cell targeting moiety is an anti-CEA antibody selected from antibody A5B7 or 806.077 antibody.
 6. The matched two component system according to claim 1 wherein the heterologous prodrug activating enzyme has mutated glycosylation sites.
 7. The matched two component system according to claim 1 wherein the heterologous prodrug activating enzyme is a carboxypeptidase.
 8. The matched two component system according to claim 1 wherein the heterologous prodrug activating enzyme is a heterologous enzyme CPG2.
 9. The matched two component system according to claim 8 in which the conjugate is a fusion protein in which the enzyme is fused to the C terminus of an antibody through the heavy or light chain thereof whereby dimerisation of the encoded conjugate when expressed can take place through a dimerisation domain on CPG2.
 10. The matched two component system according to claim 9 wherein the fusion protein is formed through linking a C-terminus of an antibody Fab heavy chain to an N-terminus of a CPG2 molecule to form a Fab-CPG2 whereby two Fab-CPG2 molecules when expressed dimerise through CPG2 to form a (Fab-CPG2)₂ conjugate.
 11. The matched two component system according to claim 1 wherein the heterologous prodrug activating enzyme is a carboxypeptidase mutant selected from the group consisting of [D253K]HCPB, [G251T,D253K]HCPB, and [A248S,G251T,D253K]HCPB.
 12. The matched two component system according to claim 1 wherein the gene construct comprises a transcriptional regulatory sequence comprising a promoter and a control element which comprises a genetic switch to control expression of the gene construct.
 13. The matched two component system according to claim 12 wherein the transcriptional regulatory sequence comprises a genetic switch control element regulated by presence of tetracycline or ecdysone.
 14. The matched two component system according to claim 12 wherein the promoter is dependent on cell type and is selected from the group consisting of promoters of genes encoding carcinoembryonic antigen (CEA), alpha-foetoprotein (AFP), tyrosine hydroxylase, choline acetyl transferase, neurone specific enolase, insulin, glial fibro acidic protein, HER-2/neu, c-erbB2, and N-myc.
 15. The matched two component system according to claim 1 wherein the gene construct is packaged within an adenovirus.
 16. A method for making the two component system according to claim 1 which comprises inserting genes encoding the cell targeting moiety, the heterologous prodrug activating enzyme, and the secretory leader sequence into a vector.
 17. A method for the delivery of a cytotoxic drug to a site which comprises administering to a mammalian host the first component and the second component of the matched two component system according to claim 15 to the mammalian host; wherein a prodrug of the second component can be converted into a cytotoxic drug in the mammalian host by the heterologous enzyme encoded by the first component.
 18. The method according to claim 17 in which the heterologous prodrug activating enzyme is a heterologous enzyme CPG2; and the second component prodrug is selected from N-(4-[N,N-bis(2-iodoethyl)amino]-phenoxycarbonyl)-L-glutamic acid, N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof.
 19. The method according to claim 17 wherein the cell targeting moiety is selectively localised to a cell surface antigen, and the cell surface antigen is specific for at least one human solid tumor.
 20. The method according to claim 17 wherein the cell targeting moiety is an antibody or fragment thereof.
 21. The method according to claim 20 wherein the cell targeting moiety is an anti-CEA antibody selected from antibody A5B7 or 806.077 antibody.
 22. The method according to claim 17 wherein the heterologous prodrug activating enzyme has mutated glycosylation sites.
 23. The method according to claim 17 wherein the heterologous prodrug activating enzyme is a carboxypeptidase.
 24. The method according to claim 17 wherein the heterologous prodrug activating enzyme is a heterologous enzyme CPG2.
 25. The method according to claim 24 in which the antibody-enzyme CPG2 conjugate is a fusion protein in which the enzyme is fused to the C terminus of the antibody through the heavy or light chain thereof whereby dimerisation of the encoded conjugate when expressed can take place through a dimerisation domain on CPG2.
 26. The method according to claim 25 wherein the fusion protein is formed through linking a C-terminus of an antibody Fab heavy chain to an N-terminus of a CPG2 molecule to form a Fab-CPG2 whereby two Fab-CPG2 molecules when expressed dimerise through CPG2 to form a (Fab-CPG2)₂ conjugate.
 27. The method according to claim 17 wherein the heterologous prodrug activating enzyme is a carboxypeptidase mutant selected from the group consisting of [D253K]HCPB, [G251T,D253K]HCPB, and [A248S,G251T,D253K]HCPB.
 28. The method according to claim 17 wherein the gene construct comprises a transcriptional regulatory sequence comprising a promoter and a control element which comprises a genetic switch to control expression of the gene construct.
 29. The method according to claim 28 wherein the transcriptional regulatory sequence comprises a genetic switch control element regulated by presence of tetracycline or ecdysone.
 30. The method according to claim 28 wherein the promoter is dependent on cell type and is selected from the group consisting of promoters of genes encoding carcinoembryonic antigen (CEA), alpha-foetoprotein (AFP), tyrosine hydroxylase, choline acetyl transferase, neurone specific enolase, insulin, glial fibro acidic protein, HER-2/neu, c-erbB2, and N-myc.
 31. The method according to claim 17 wherein the gene construct is packaged within an adenovirus.
 32. A matched two component system comprising: (i) a first component that comprises a gene construct encoding a cell targeting moiety, a heterologous prodrug activating enzyme, and a secretory leader sequence expressed as a conjugate; and (ii) a second component that comprises a prodrug which is converted into a cytotoxic drug by the conjugate.
 33. The matched two component system according to claim 32 wherein the cell targeting moiety is an antibody or fragment thereof which is selectively localised to a cell surface antigen, and the cell surface antigen is specific for at least one human solid tumor.
 34. A method for the delivery of a cytotoxic drug which comprises administering to a mammalian host: the first component and the second component of the matched two component system according to claim 32 to the mammalian host; wherein the prodrug of the second component is converted into a cytotoxic drug in the mammalian host by the heterologous enzyme encoded by the gene construct.